From: Matt Glazar <strager@fb.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: git-fetch pulls already-pulled objects?
Date: Thu, 29 Oct 2015 19:52:27 +0000
Message-ID: <D257C4CB.1378A%strager@fb.com>
In-Reply-To: <xmqqwpu5qzxs.fsf@gitster.mtv.corp.google.com>

> I forgot to mention the recent "pack bitmap" addition.  It makes the
> set of "can be cheaply proven to exist" a lot larger.

Cool! I tried this feature, and it worked! (At least, it worked for my
small test case.)

On the server (after pushing the objects), I ran:

git config repack.writeBitmaps true
git repack -Ad

After this, 'git fetch origin master2' was super quick.

Thanks for your help!

Aside: This test case is using (normal, C/sh) Git. My production
environment uses JGit on the server. I haven't tested this with JGit.

-----Original Message-----
From: Junio C Hamano <gitster@pobox.com>
Date: Thursday, October 29, 2015 at 11:42 AM
To: Matt Glazar <strager@fb.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: git-fetch pulls already-pulled objects?

>Matt Glazar <strager@fb.com> writes:
>
>> Would negotiating the tree object hashes be possible on the client
>> without server changes? Is the protocol that flexible?
>
>The protocol is strictly "find common ancestor in the commit
>history".  Everything else is done on the sender.
>
>>>The object transfer is done by first finding the common ancestor of
>>>histories of the sending and the receiving sides, which allows the
>>>sender to enumerate commits that the sender has but the receiver
>>>doesn't.  From there, all objects [*1*] that are referenced by these
>>>commits need to be sent.
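
The exchange described above can be sketched with a toy Python model. This is only an illustration under simplifying assumptions, not Git's actual implementation: the commit names, the PARENTS table, and the helpers negotiate and commits_to_send are all hypothetical.

```python
# Toy model of "find common ancestor, then enumerate what the receiver lacks".
# All names here are hypothetical; real Git works on object IDs over a wire protocol.

PARENTS = {          # commit -> list of parent commits (the sender's history)
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
}


def ancestors(tips, parents):
    """Return every commit reachable from the tips, including the tips."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen


def negotiate(receiver_haves, sender_commits):
    """Keep only the advertised "have" commits that the sender also knows."""
    return [c for c in receiver_haves if c in sender_commits]


def commits_to_send(want, common, parents):
    """Commits reachable from the wanted tip but not from any common commit."""
    return ancestors([want], parents) - ancestors(common, parents)


sender_commits = ancestors(["D"], PARENTS)  # the sender has A, B, C, D
receiver_haves = ["B", "X"]                 # X exists only on the receiver
common = negotiate(receiver_haves, sender_commits)
missing = commits_to_send("D", common, PARENTS)

print(common)           # ['B']
print(sorted(missing))  # ['C', 'D']
```

Note that only commits take part in the negotiation; trees and blobs never appear in it, which is the point made above.
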
>
>>>[Footnote]
>>>
>>>*1* There is an optimization to exclude the trees and blobs that can
>>>be cheaply proven to exist on the receiving end.  If the receiving
>>>end has a commit that the sending end does *not* have, and that
>>>commit happens to record a tree the sending end needs to send,
>>>however, the sending end cannot prove that the tree does not have to
>>>be sent without first fetching that commit from the receiving end,
>>>which fails "can be cheaply proven to exist" test.
>
>I forgot to mention the recent "pack bitmap" addition.  It makes the
>set of "can be cheaply proven to exist" a lot larger.
>
>If for example the sender needs to send one commit C because it
>determined that the receiver has history up to commit C~1, without
>the bitmap, even when C^{tree} (i.e. the tree of C) is identical to
>C~2^{tree} (i.e. the tree of C~2), it would have sent that tree
>object because "proving that the receiver already has it" would
>require the sender to dig its history back, starting from C~1
>(i.e. the commit that is known to exist at the receiver), to
>enumerate the objects contained in the common part of the history,
>which fails the "can be cheaply proven to exist" test.
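
A minimal sketch of the no-bitmap case, assuming a simplified sender that only treats the trees recorded directly by the boundary commit (C~1) as known to the receiver. The HISTORY table and objects_to_send_cheaply are hypothetical, and blobs and subtrees are left out to keep the model small.

```python
# Toy model: without a bitmap, the sender only "cheaply" knows the objects
# recorded directly by the boundary commit, not by its older ancestors.
# This is a simplification for illustration, not Git's real algorithm.

HISTORY = {  # commit -> (parent or None, tree id)
    "C~2": (None, "tree-1"),
    "C~1": ("C~2", "tree-2"),
    "C": ("C~1", "tree-1"),  # e.g. a revert: same tree as C~2
}


def objects_to_send_cheaply(new_commits, boundary, history):
    """List objects to send, excluding only trees of the boundary commits."""
    known_trees = {history[c][1] for c in boundary}
    out = []
    for commit in new_commits:
        out.append(commit)
        tree = history[commit][1]
        if tree not in known_trees:
            out.append(tree)
    return out


sent = objects_to_send_cheaply(["C"], ["C~1"], HISTORY)
print(sent)  # ['C', 'tree-1']
```

The tree is sent even though the receiver already has it through C~2, because proving that would mean walking history back from C~1.
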
>
>The "pack bitmap" pre-computes what commits, trees and blobs should
>already exist in the repository given a commit for which bitmap
>exists.  Using the bitmap, from C~1 (i.e. the commit known to exist
>at the receiving end), it can be proven cheaply that C^{tree} that
>happens to be identical to C~2^{tree} already exists over there, and
>the sender can use this knowledge to reduce the transfer.
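
The bitmap idea can be sketched in Python using integers as bitsets. This is a hedged illustration only: the real on-disk bitmap format and selection of commits that get a bitmap are more involved, and HISTORY and build_bitmaps are hypothetical names.

```python
# Toy model: precompute, per commit, a bitset of every object reachable from it.
# Python ints serve as bitsets; each object gets one bit position.

HISTORY = [  # oldest first: (commit, parent or None, tree id)
    ("C~2", None, "tree-1"),
    ("C~1", "C~2", "tree-2"),
    ("C", "C~1", "tree-1"),  # same tree as C~2
]


def build_bitmaps(history):
    """Return (object -> bit, commit -> bitset of reachable objects)."""
    bit_of = {}
    bitmaps = {}

    def bit(obj):
        if obj not in bit_of:
            bit_of[obj] = 1 << len(bit_of)
        return bit_of[obj]

    for commit, parent, tree in history:
        reach = bit(commit) | bit(tree)
        if parent is not None:
            reach |= bitmaps[parent]  # inherit everything the parent reaches
        bitmaps[commit] = reach
    return bit_of, bitmaps


bit_of, bitmaps = build_bitmaps(HISTORY)
receiver_has = bitmaps["C~1"]            # receiver is known to have C~1
needed = bitmaps["C"] & ~receiver_has    # reachable from C, not from C~1
to_send = [obj for obj, b in bit_of.items() if needed & b]
print(to_send)  # ['C']
```

With the precomputed bitset for C~1, excluding tree-1 is a single bitwise operation instead of a history walk.
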
>
>The "pack bitmap" however does not change the fundamental structure.
>If your receiver has a commit that is not known to the sender, and
>that commit happens to record the same tree recorded in the commit
>that needs to be sent, there is no way for the sender to know that
>the receiver has it, exactly because the exchange between them is
>purely "find common ancestor in history".
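
The remaining limitation can be shown with one more toy Python model. The SENDER and RECEIVER tables and reachable_objects are hypothetical, and the sender here is assumed to have perfect reachability knowledge (as with bitmaps) for every commit it knows.

```python
# Toy model: the receiver has a commit X the sender has never seen, and X
# records the same tree as the commit D that the sender must send.

SENDER = {  # commit -> (parent or None, tree id)
    "B": (None, "tree-1"),
    "D": ("B", "tree-9"),
}
RECEIVER = {
    "B": (None, "tree-1"),
    "X": ("B", "tree-9"),  # receiver-only commit with the same tree as D
}


def reachable_objects(tip, repo):
    """Return the commits and trees reachable from tip in a linear history."""
    objs = set()
    commit = tip
    while commit is not None:
        parent, tree = repo[commit]
        objs.update((commit, tree))
        commit = parent
    return objs


# The receiver advertises X and B; the sender can only recognize B.
common = [c for c in ["X", "B"] if c in SENDER]
proven = set()
for c in common:
    proven |= reachable_objects(c, SENDER)

to_send = reachable_objects("D", SENDER) - proven
already_there = to_send & reachable_objects("X", RECEIVER)
print(sorted(to_send))        # ['D', 'tree-9']
print(sorted(already_there))  # ['tree-9']
```

Even with full reachability data on its own side, the sender transfers tree-9 again, because it can reason only about commits it shares with the receiver.
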

