From: Jeff King
Subject: [RFC/PATCH] gc: do not explode objects which will be immediately pruned
Date: Sat, 7 Apr 2012 06:30:09 -0400
Message-ID: <20120407103008.GA11242@sigill.intra.peff.net>
To: git@vger.kernel.org
Cc: Nicolas Pitre

When we pack everything into one big pack with "git repack
-Ad", any unreferenced objects in to-be-deleted packs are
exploded into loose objects, with the intent that they will
be examined and possibly cleaned up by the next run of "git
prune".

Since the exploded objects will receive the mtime of the
pack from which they come, if the source pack is old, those
loose objects will end up pruned immediately. In that case,
it is much more efficient to skip the exploding step
entirely for these objects.

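The reasoning above can be sketched as a single comparison. This is only an illustration, not Git's code: the function name `explode_is_wasted` and its parameters are hypothetical, and it assumes an expiration value of 0 means "no cutoff was given".

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper for illustration only (not part of Git).
 *
 * pack_mtime: modification time of the pack that is about to be deleted.
 * expire:     cutoff time; loose objects older than this get pruned.
 *             Assumed convention: 0 means no cutoff was supplied.
 *
 * Exploded objects inherit pack_mtime, so if the pack is already older
 * than the cutoff, every object written out would be pruned right away
 * and the explode step is wasted work.
 */
bool explode_is_wasted(time_t pack_mtime, time_t expire)
{
	return expire != 0 && pack_mtime < expire;
}
```

With a cutoff of 200, a pack with mtime 100 would be skipped, while a pack with mtime 300 would still be exploded so that its objects get their grace period.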
This patch teaches pack-objects to receive the expiration
information and avoid writing these objects out. It also
teaches "git gc" to pass the value of gc.pruneexpire to
repack (which in turn learns to pass it along to
pack-objects) so that this optimization happens
automatically during "git gc" and "git gc --auto".

Signed-off-by: Jeff King
---
 builtin/gc.c                         |   23 +++++++++++++++--------
 builtin/pack-objects.c               |   25 +++++++++++++++++++++++--
 git-repack.sh                        |   10 +++++++++-
 t/t7701-repack-unpack-unreachable.sh |   14 ++++++++++++++
 4 files changed, 61 insertions(+), 11 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 271376d..1bc2fe3 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -144,6 +144,19 @@ static int too_many_packs(void)
 	return gc_auto_pack_limit <= cnt;
 }
 
+static void add_repack_all_option(void)
+{
+	if (prune_expire && !strcmp(prune_expire, "now"))
+		append_option(argv_repack, "-a", MAX_ADD);
+	else {
+		append_option(argv_repack, "-A", MAX_ADD);
+		if (prune_expire) {
+			append_option(argv_repack, "--unpack-unreachable", MAX_ADD);
+			append_option(argv_repack, prune_expire, MAX_ADD);
+		}
+	}
+}
+
 static int need_to_gc(void)
 {
 	/*
@@ -160,10 +173,7 @@ static int need_to_gc(void)
 	 * there is no need.
 	 */
 	if (too_many_packs())
-		append_option(argv_repack,
-			      prune_expire && !strcmp(prune_expire, "now") ?
-			      "-a" : "-A",
-			      MAX_ADD);
+		add_repack_all_option();
 	else if (!too_many_loose_objects())
 		return 0;
 
@@ -227,10 +237,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 					"run \"git gc\" manually. See "
 					"\"git help gc\" for more information.\n"));
 	} else
-		append_option(argv_repack,
-			      prune_expire && !strcmp(prune_expire, "now")
-			      ? "-a" : "-A",
-			      MAX_ADD);
+		add_repack_all_option();
 
 	if (pack_refs && run_command_v_opt(argv_pack_refs, RUN_GIT_CMD))
 		return error(FAILED_RUN, argv_pack_refs[0]);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7b07c09..1861093 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -63,6 +63,7 @@ static uint32_t nr_objects, nr_alloc, nr_result, nr_written;
 static int non_empty;
 static int reuse_delta = 1, reuse_object = 1;
 static int keep_unreachable, unpack_unreachable, include_tag;
+static unsigned long unpack_unreachable_expiration;
 static int local;
 static int incremental;
 static int ignore_packed_keep;
@@ -2249,6 +2250,10 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
 		if (!p->pack_local || p->pack_keep)
 			continue;
 
+		if (unpack_unreachable_expiration &&
+		    p->mtime < unpack_unreachable_expiration)
+			continue;
+
 		if (open_pack_index(p))
 			die("cannot open pack index");
 
@@ -2315,6 +2320,21 @@ static int option_parse_index_version(const struct option *opt,
 	return 0;
 }
 
+static int option_parse_unpack_unreachable(const struct option *opt,
+					   const char *arg, int unset)
+{
+	if (unset) {
+		unpack_unreachable = 0;
+		unpack_unreachable_expiration = 0;
+	}
+	else {
+		unpack_unreachable = 1;
+		if (arg)
+			unpack_unreachable_expiration = approxidate(arg);
+	}
+	return 0;
+}
+
 static int option_parse_ulong(const struct option *opt,
 			      const char *arg, int unset)
 {
@@ -2392,8 +2412,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			 "include tag objects that refer to objects to be packed"),
 		OPT_BOOL(0, "keep-unreachable", &keep_unreachable,
 			 "keep unreachable objects"),
-		OPT_BOOL(0, "unpack-unreachable", &unpack_unreachable,
-			 "unpack unreachable objects"),
+		{ OPTION_CALLBACK, 0, "unpack-unreachable", NULL, "time",
+		  "unpack unreachable objects newer than