git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Brad Litterell <brad@evidence.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Question about how git determines the minimum packfile for a push.
Date: Tue, 28 Apr 2015 01:33:32 -0400
Message-ID: <20150428053332.GH24580@peff.net>
In-Reply-To: <E51394554A503C4E852F9BEE46B03E8D01E4E784@TI-ODIN.tasernet.com>

On Mon, Apr 27, 2015 at 12:41:28AM +0000, Brad Litterell wrote:

> Is it possible git is not computing the delta correctly?  Or does git
> only look at the top-level commit objects to figure out what to
> include in the push packfile?

It's the latter. Junio mentioned that "push" is not as thorough about
finding common ancestors as "fetch", but I think even "fetch" would have
the same problem.

If we know that the other side has commit X, we know that it also has
X~3, and we also know that it has every tree and blob mentioned by X~3.
But it's much too expensive to open up every tree to generate the full
set of reachable objects; for the Linux kernel, that is something like 45
seconds of CPU time, just to find out "oh, we only need to send 5
objects".
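
To make the difference concrete, here is a minimal sketch in a throwaway
repository (the file names and the "demo" identity are made up for
illustration). Plain "git rev-list" walks only commits; adding
"--objects" also opens every tree to list trees and blobs, and that is
the part that gets expensive on a large history:

```shell
# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

for i in 1 2 3; do
    echo "content $i" >"file$i.txt"
    git add "file$i.txt"
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "commit $i"
done

# Commit-only walk: cheap, reads nothing but commit objects.
commits=$(git rev-list HEAD | wc -l | tr -d ' ')

# Full walk: additionally opens each tree to enumerate trees and blobs.
objects=$(git rev-list --objects HEAD | wc -l | tr -d ' ')

echo "commits: $commits, all reachable objects: $objects"
```

The second number grows with the size of every tree in the history,
which is why git avoids computing it just to decide what to send.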

This works pretty well in practice, because trees and blobs from older
history don't tend to resurface verbatim. But as you noticed, there are
certain cases where it does happen, and the number of objects affected
can be quite large (to the point that sending the extra objects is much
more expensive than the cost of doing the extra tree traversal).
Unfortunately there is no "look harder" option you can give to
"git push" when you, as the user, realize this is happening.
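
A rough way to see this locally, assuming "git rev-list --objects" is a
fair stand-in for what push computes (it is not the exact code path),
is to let a blob disappear and then come back with identical content.
The commit() helper and file names here are hypothetical:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Hypothetical helper so the demo does not depend on global git config.
commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

echo "big payload" >data.txt
echo "readme" >README
git add data.txt README
commit -m "A: add data.txt"

git rm -q data.txt
commit -m "B: remove data.txt"
have=$(git rev-parse HEAD)      # pretend the other side already has B

echo "big payload" >data.txt    # identical content resurfaces
git add data.txt
commit -m "C: bring data.txt back"

blob=$(git rev-parse HEAD:data.txt)

# Objects git would consider new when the other side has B. Only the
# tree of the boundary commit B is consulted, not the trees of A.
to_send=$(git rev-list --objects HEAD "^$have")

if printf '%s\n' "$to_send" | grep -q "^$blob"; then
    echo "data.txt is listed again, although it is reachable from A"
fi
```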

If the repository has pack reachability bitmaps, git can use them to
get a more thorough answer cheaply. So probably:

  git repack -adb
  git push

on the client would make this work as you expect.
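
If you want to check that the repack really wrote a bitmap before
pushing, something like the following sketch should do. It runs in a
throwaway repository with made-up names, and it assumes the default
layout where packs live under .git/objects/pack:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

echo "hello" >file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit"

# -a: put everything in one pack, -d: drop the now-redundant packs,
# -b: write a reachability bitmap index next to the new pack.
git repack -adb -q

# Count the bitmap files that now sit beside the pack.
bitmaps=$(ls .git/objects/pack/*.bitmap 2>/dev/null | wc -l | tr -d ' ')
echo "bitmap files: $bitmaps"
```

Setting repack.writeBitmaps to true in the config should have the same
effect on later "git repack -ad" runs without passing -b each time.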

> Will it upload the larger pack only to have the server correctly
> handle the duplicates?

Yes, the receiving side should correctly handle the duplicates.

-Peff

Thread overview: 3+ messages
2015-04-27  0:41 Question about how git determines the minimum packfile for a push Brad Litterell
2015-04-27  4:39 ` Junio C Hamano
2015-04-28  5:33 ` Jeff King [this message]