From: Matt Glazar <strager@fb.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: git-fetch pulls already-pulled objects?
Date: Thu, 29 Oct 2015 19:52:27 +0000
Message-ID: <D257C4CB.1378A%strager@fb.com>
In-Reply-To: <xmqqwpu5qzxs.fsf@gitster.mtv.corp.google.com>
> I forgot to mention the recent "pack bitmap" addition. It makes the
> set of "can be cheaply proven to exist" a lot larger.
Cool! I tried this feature, and it worked! (At least, it worked for my
small test case.)
I ran on the server (after pushing the objects):
    git config repack.writeBitmaps true
    git repack -Ad
After this, the 'git fetch origin master2' was super quick.
Thanks for your help!
Aside: This test case is using (normal, C/sh) Git. My production
environment uses JGit on the server. I haven't tested this with JGit.
-----Original Message-----
From: Junio C Hamano <gitster@pobox.com>
Date: Thursday, October 29, 2015 at 11:42 AM
To: Matt Glazar <strager@fb.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: git-fetch pulls already-pulled objects?
>Matt Glazar <strager@fb.com> writes:
>
>> Would negotiating the tree object hashes be possible on the client
>> without server changes? Is the protocol that flexible?
>
>The protocol is strictly "find common ancestor in the commit
>history". Everything else is done on the sender.
>
>>>The object transfer is done by first finding the common ancestor of
>>>histories of the sending and the receiving sides, which allows the
>>>sender to enumerate commits that the sender has but the receiver
>>>doesn't. From there, all objects [*1*] that are referenced by these
>>>commits need to be sent.
>
>>>[Footnote]
>>>
>>>*1* There is an optimization to exclude the trees and blobs that can
>>>be cheaply proven to exist on the receiving end. If the receiving
>>>end has a commit that the sending end does *not* have, and that
>>>commit happens to record a tree the sending end needs to send,
>>>however, the sending end cannot prove that the tree does not have to
>>>be sent without first fetching that commit from the receiving end,
>>>which fails "can be cheaply proven to exist" test.
>
>I forgot to mention the recent "pack bitmap" addition. It makes the
>set of "can be cheaply proven to exist" a lot larger.
>
>If for example the sender needs to send one commit C because it
>determined that the receiver has history up to commit C~1, without
>the bitmap, even when C^{tree} (i.e. the tree of C) is identical to
>C~2^{tree} (i.e. the tree of C~2), it would have sent that tree
>object because "proving that the receiver already has it" would
>require the sender to dig its history back, starting from C~1
>(i.e. the commit that is known to exist at the receiver), to
>enumerate the objects contained in the common part of the history,
>which fails the "can be cheaply proven to exist" test.
>
>The "pack bitmap" pre-computes what commits, trees and blobs should
>already exist in the repository given a commit for which a bitmap
>exists. Using the bitmap, from C~1 (i.e. the commit known to exist
>at the receiving end), it can be proven cheaply that C^{tree} that
>happens to be identical to C~2^{tree} already exists over there, and
>the sender can use this knowledge to reduce the transfer.
>
>The "pack bitmap" however does not change the fundamental structure.
>If your receiver has a commit that is not known to the sender, and
>that commit happens to record the same tree recorded in the commit
>that needs to be sent, there is no way for the sender to know that
>the receiver has it, exactly because the exchange between them is
>purely "find common ancestor in history".
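
To make the explanation quoted above concrete, here is a toy model in
Python. It is only a sketch under loud assumptions, not Git's actual
pack-objects logic or bitmap format: history is linear, each commit
records exactly one tree and nothing else, and the names history,
reachable, build_bitmaps and objects_to_send are made up for this
illustration. The "no bitmap" case is simplified to "only the boundary
commit and its own tree count as cheaply proven".

```python
# Toy model only. Each commit maps to (parent, tree); all names are hypothetical.

def history(commits, tip):
    """Yield commit names from tip back to the root (linear history only)."""
    while tip is not None:
        yield tip
        tip = commits[tip][0]

def reachable(commits, tip):
    """Every commit and tree reachable from tip."""
    objs = set()
    for c in history(commits, tip):
        objs.add(c)
        objs.add(commits[c][1])
    return objs

def build_bitmaps(commits):
    """Stand-in for repack-time bitmaps: precompute reachability per commit."""
    return {c: frozenset(reachable(commits, c)) for c in commits}

def objects_to_send(commits, want, common, bitmaps=None):
    """Objects the sender transfers, given the common commit found by negotiation."""
    new = []
    for c in history(commits, want):
        if c == common:
            break
        new.append(c)
    candidates = set(new) | {commits[c][1] for c in new}
    if bitmaps is not None:
        # Cheap lookup: everything reachable from the common commit.
        known = bitmaps[common]
    else:
        # Simplified cheap check: only the boundary commit and its own tree.
        known = {common, commits[common][1]}
    return candidates - known

# Case 1: C's tree is identical to C~2's tree; the receiver has up to C~1.
sender = {
    "C~2": (None, "tree-A"),
    "C~1": ("C~2", "tree-B"),
    "C": ("C~1", "tree-A"),
}
without_bitmap = objects_to_send(sender, "C", "C~1")
with_bitmap = objects_to_send(sender, "C", "C~1", build_bitmaps(sender))
print(sorted(without_bitmap))  # ['C', 'tree-A']
print(sorted(with_bitmap))     # ['C']

# Case 2: the receiver also holds a commit X (recording tree-A) that the
# sender has never seen. X is not in the sender's data, so no bitmap covers it.
sender2 = {
    "B": (None, "tree-B"),
    "C": ("B", "tree-A"),
}
unknown_commit_case = objects_to_send(sender2, "C", "B", build_bitmaps(sender2))
print(sorted(unknown_commit_case))  # ['C', 'tree-A']
```

In case 1 the bitmap for C~1 already covers tree-A, so only the commit
itself is sent. In case 2 the tree is sent even with bitmaps, because
the only thing the two sides agree on is the common ancestor B.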