git.vger.kernel.org archive mirror
From: Sitaram Chamarty <sitaramc@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Storm-Olsen\, Marius" <Marius.Storm-Olsen@student.bi.no>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	milki <milki@rescomp.berkeley.edu>
Subject: Re: optimising a push by fetching objects from nearby repos
Date: Mon, 12 May 2014 07:20:59 +0530
Message-ID: <53702903.20904@gmail.com>
In-Reply-To: <xmqqbnv4ur7t.fsf@gitster.dls.corp.google.com>

On 05/11/2014 11:34 PM, Junio C Hamano wrote:
> Sitaram Chamarty <sitaramc@gmail.com> writes:
>
>> But what I was looking for was validation from git.git folks of the idea
>> of replicating what "git clone -l" does, for an *existing* repo.
>>
>> For example, I'm assuming that bringing in only the objects -- without
>> any of the refs pointing to them, making them all dangling objects --
>> will still allow the optimisation to occur (i.e., git will still say "oh
>> yeah I have these objects, even if they're dangling so I won't ask for
>> them from the pusher" and not "oh these are dangling objects; so I don't
>> recognise them from this perspective -- you'll have to send me those
>> again").
>
> So here is an educated guess by a git.git folk.  I haven't read the
> codepath for some time, so I may be missing some details:
>
>   - The set of objects sent over the wire in "push" direction is
>     determined by the receiving end listing what it has to the
>     sending end, and then the sending end excluding what the
>     receiving end told that it already has.
>
>   - The receiving end tells the sending end what it has by showing
>     the names of its refs and their values.
>
> Having otherwise dangling objects in your object store alone will
> not make them reachable from the refs shown to the sending end.  But
> there is another trick the receiving end employs.
>
>   - The receiving end also includes the refs and their values that
>     appear in its alternate repositories (the ones it borrows
>     objects from) when it tells the sending end what objects it
>     already has.
>
> So what you "assumed" is not entirely correct---bringing in only the
> objects will not give you any optimization.
>
> But because we infer from the location of the object store
> (i.e. "objects" directory) where the refs that point at these
> borrowed objects exist (i.e. in "../refs" relative to that "objects"
> directory) in order to make sure that we do not have to say "oh
> these are dangling but we know their history is not broken", we
> still get the same optimisation.
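
The negotiation described above can be illustrated with a toy model. This
is only a sketch under simplifying assumptions: the object graph is a plain
dict, object names are single letters, and `reachable` and
`objects_to_send` are hypothetical helpers, not git internals. It shows
why objects that are present but unadvertised save nothing, while a ref
pointing at them does.

```python
from typing import Dict, Iterable, List, Set

# Toy object graph as seen by the sender: each object id maps to the
# ids it points at (its parents). Purely illustrative, not git's format.
Graph = Dict[str, List[str]]


def reachable(graph: Graph, tips: Iterable[str]) -> Set[str]:
    """Return every object reachable from the given tips."""
    seen: Set[str] = set()
    # Tips the sender does not know about cannot be used to exclude anything.
    stack = [t for t in tips if t in graph]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(graph.get(obj, []))
    return seen


def objects_to_send(graph: Graph, pushed_tip: str,
                    advertised: Iterable[str]) -> Set[str]:
    """Objects the sender transmits: reachable from the pushed tip,
    minus whatever is reachable from the refs the receiver advertised."""
    return reachable(graph, [pushed_tip]) - reachable(graph, advertised)


# Linear history A <- B <- C <- D; the pusher wants to push D.
history: Graph = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}

# Receiver advertises only its own ref at A. Even if B and C sit in its
# object store as dangling objects, the sender never hears about them.
without_ref = objects_to_send(history, "D", ["A"])

# Receiver also advertises a ref at C (via alternates or a temporary ref).
with_ref = objects_to_send(history, "D", ["A", "C"])
```

In the first case the sender transmits B, C and D; in the second only D.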

Thanks!

Everything makes sense.  However, I'm not using the alternates
mechanism.

Since gitolite has the advantage of allowing me to do something before
and something after the git-receive-pack, I'm fetching all the refs into
a temporary namespace before, and deleting all of them after.  So, just
for the duration of the push, the refs do exist, and optimisation (of
network traffic) therefore happens.

In addition, since I check that the user has read access to the lender
repo (and don't do this optimisation if he does not), there is -- by
definition -- no security issue, in the sense that he cannot get
anything from the lender repo that he could not have got directly.
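
For readers who want to picture the sequence, here is a rough sketch of
the steps described above, not the actual gitolite code. The namespace
`refs/borrowed`, the `can_read` callback and the `receive_pack` callback
are all assumptions made for illustration; the git commands used
(`git fetch` with a refspec, `git for-each-ref --format`, `git update-ref
--stdin`) are standard ones, and the sketch assumes it runs with the
receiving repo as the current directory.

```python
import subprocess
from typing import Callable, List, Optional

TEMP_NS = "refs/borrowed"  # hypothetical name for the temporary namespace

Runner = Callable[[List[str], Optional[str]], str]


def run_git(argv: List[str], stdin: Optional[str] = None) -> str:
    """Run a command in the current (receiving) repo and return its stdout."""
    return subprocess.run(argv, input=stdin, text=True,
                          capture_output=True, check=True).stdout


def fetch_cmd(lender: str) -> List[str]:
    """Copy every ref of the lender repo into the temporary namespace."""
    return ["git", "fetch", "--quiet", "--no-tags", lender,
            f"+refs/*:{TEMP_NS}/*"]


def list_cmd() -> List[str]:
    """Emit one 'delete <refname>' line per temporary ref, which is the
    input format that 'git update-ref --stdin' accepts."""
    return ["git", "for-each-ref", "--format=delete %(refname)", TEMP_NS]


def push_with_borrowed_refs(user: str, lender: str,
                            can_read: Callable[[str, str], bool],
                            receive_pack: Callable[[], None],
                            run: Runner = run_git) -> bool:
    """Wrap receive-pack with the before/after steps; return True if
    the optimisation was applied."""
    if not can_read(user, lender):
        # No read access to the lender: plain push, nothing borrowed.
        receive_pack()
        return False
    run(fetch_cmd(lender), None)          # before: refs exist temporarily
    try:
        receive_pack()                    # refs are advertised to the pusher
    finally:
        deletions = run(list_cmd(), None)  # after: remove the temporary refs
        if deletions:
            run(["git", "update-ref", "--stdin"], deletions)
    return True
```

The cleanup sits in a `finally` block so that the temporary refs are
removed even if the push itself fails.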

Thanks for all your help again, especially the very clear explanation!

regards
sitaram
