git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Nicolas Pitre <nico@fluxnic.net>
Subject: Re: thin packs ending up fat
Date: Thu, 12 Jan 2012 17:32:34 -0500
Message-ID: <20120112223234.GA4949@sigill.intra.peff.net>
In-Reply-To: <20120112221523.GA3663@sigill.intra.peff.net>

On Thu, Jan 12, 2012 at 05:15:23PM -0500, Jeff King wrote:

> It turns out that when packing a subset of a fully packed repo (as we do
> for a bundle or for a fetch), we tend not to make thin packs at all.
> The culprit is this logic in try_delta:
> 
>         /*
>          * We do not bother to try a delta that we discarded
>          * on an earlier try, but only when reusing delta data.
>          */
>         if (reuse_delta && trg_entry->in_pack &&
>             trg_entry->in_pack == src_entry->in_pack &&
>             trg_entry->in_pack_type != OBJ_REF_DELTA &&
>             trg_entry->in_pack_type != OBJ_OFS_DELTA)
>                 return 0;
> [...]
> Maybe it is enough to simply turn off this optimization if the potential
> delta source is not being included in the pack (i.e., we are using
> --thin and it is a boundary object). Because if both objects are being
> sent, we will just end up reusing the delta that goes in the reverse
> direction anyway.

Hmm. It turns out this is really easy, because we have already marked
such objects as preferred bases.

So with this patch:

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 96c1680..d05e228 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1439,6 +1439,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 	 */
 	if (reuse_delta && trg_entry->in_pack &&
 	    trg_entry->in_pack == src_entry->in_pack &&
+	    !src_entry->preferred_base &&
 	    trg_entry->in_pack_type != OBJ_REF_DELTA &&
 	    trg_entry->in_pack_type != OBJ_OFS_DELTA)
 		return 0;

here are the numbers I get:

                  dataset
            | fetches | tags
---------------------------------
     before | 53358   | 2750977
size  after | 32398   | 2668479
     change |   -39%  |      -3%
---------------------------------
     before |  0.18   | 1.12
CPU   after |  0.18   | 1.15
     change |    +0%  |      +3%

So we get nearly all of the size benefit with very little CPU change
(even the 3% in the larger-pack case is close to run-to-run noise).
Obviously the size benefit in the larger-pack case isn't impressive, but
I think the "fetches" case is much more indicative of real server
load.
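
To make the effect of the added condition easier to see in isolation,
here is a minimal standalone sketch of the check from the patch. It is
not the real code from builtin/pack-objects.c: "struct entry", the
reduced "enum obj_type", and the helper name skip_delta_attempt() are
assumptions made up for illustration, and reuse_delta is passed as a
parameter rather than read from a global.

```c
#include <stdbool.h>

/* Simplified stand-ins for illustration; not git's real definitions. */
enum obj_type { OBJ_BLOB, OBJ_OFS_DELTA, OBJ_REF_DELTA };

struct entry {
	const void *in_pack;        /* pack the object is stored in, or NULL */
	enum obj_type in_pack_type; /* how the object is stored in that pack */
	bool preferred_base;        /* usable as a delta base, but not sent */
};

/*
 * Hypothetical helper mirroring the patched condition in try_delta():
 * returns true when the delta attempt would be skipped (the "return 0").
 */
static bool skip_delta_attempt(bool reuse_delta,
			       const struct entry *trg,
			       const struct entry *src)
{
	return reuse_delta && trg->in_pack &&
	       trg->in_pack == src->in_pack &&
	       !src->preferred_base && /* the line added by the patch */
	       trg->in_pack_type != OBJ_REF_DELTA &&
	       trg->in_pack_type != OBJ_OFS_DELTA;
}
```

With both objects stored undeltified in the same pack, the helper
reports a skip, as before the patch. Once the source is marked as a
preferred base, it no longer does, so the delta attempt goes ahead.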

-Peff

