git.vger.kernel.org archive mirror
From: Sam Vilain <sam@vilain.net>
To: Nicolas Pitre <nico@cam.org>
Cc: Jakub Narebski <jnareb@gmail.com>,
	Tomasz Kontusz <roverorna@gmail.com>, git <git@vger.kernel.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Scott Chacon <schacon@gmail.com>
Subject: Re: Continue git clone after interruption
Date: Sun, 23 Aug 2009 22:37:09 +1200
Message-ID: <1251023829.8115.23.camel@maia.lan>
In-Reply-To: <alpine.LFD.2.00.0908220155240.6044@xanadu.home>

On Sat, 2009-08-22 at 04:13 -0400, Nicolas Pitre wrote:
> > Ok, but right now there's no way to specify that you want a thin pack,
> > where the allowable base objects are *newer* than the commit range you
> > wish to include.
> 
> Sure you can.  Try this:
> 
> 	( echo "-$(git rev-parse v1.6.4)"; \
> 	  git rev-list --objects v1.6.2..v1.6.3 ) | \
> 		git pack-objects --progress --stdout > foo.pack
> 
> That'll give you a thin pack for the _new_ objects that _appeared_ 
> between v1.6.2 and v1.6.3, but whose external delta base objects are 
> found in v1.6.4.

Aha.  I guess I had made an assumption about where that '-' lets
pack-objects find delta bases, and that assumption isn't true.
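
For anyone wanting to try this on a small scale, here is a minimal sketch
that applies the same recipe to a throwaway repository. The repository
layout, the file name data.txt and the tags v1..v3 are made up for
illustration. Whether pack-objects actually picks a base from v3 depends on
its delta heuristics, so the script only prints the two sizes and does not
promise any particular result.

```shell
#!/bin/sh
# Sketch only: repository layout, file name and tag names are hypothetical.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git config user.name "Example"
git config user.email "example@example.invalid"

# Three tagged revisions of one file, so a later revision is a plausible delta base.
for n in 1 2 3; do
    awk -v n="$n" 'BEGIN { for (i = 1; i <= 2000; i++) print "line", i; print "revision", n }' > data.txt
    git add data.txt
    git commit -q -m "revision $n"
    git tag "v$n"
done

# Objects that appeared in v1..v2, with v3 offered as a preferred delta base.
( echo "-$(git rev-parse v3)"; \
  git rev-list --objects v1..v2 ) | \
	git pack-objects --stdout > "$tmp/thin.pack"

# The same objects without the hint, for a size comparison.
git rev-list --objects v1..v2 | git pack-objects --stdout > "$tmp/full.pack"

wc -c "$tmp/thin.pack" "$tmp/full.pack"

# A receiver that already has v3 can complete the pack with the missing bases.
git index-pack --fix-thin --stdin < "$tmp/thin.pack"
```

The last step uses --fix-thin because a thin pack cannot be indexed on its
own; the receiver has to supply the base objects it already holds.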

> > What I said in my other e-mail where I showed how well it works taking
> > a given bundle, and slicing it into a series of thin packs, was that it
> > seems to add a bit of extra size to the resultant packs - best I got for
> > slicing up the entire git.git run was about 20%.  If this can be
> > reduced to under 10% (say), then sending bundle slices would be quite
> > reasonable by default for the benefit of making large fetches
> > restartable, or even spreadable across multiple mirrors.
> 
> In theory you could have almost no overhead.  That all depends on how you 
> slice the pack.  If you want a pack to contain a fixed number of commits 
> (such that all objects introduced by a given commit are all in the same 
> pack) then you are of course putting a constraint on the possible delta 
> matches and compression result might be suboptimal.  In comparison, with 
> a single big pack a given blob can delta against a blob from a 
> completely distant commit in the history graph if that provides a better 
> compression ratio.
 [...]
> If you were envisioning _clients_ à la BitTorrent putting up pack slices 
> instead, then in that case the slices have to be well defined entities, 
> like packs containing objects for a known range of commits, but then we're 
> back to the delta inefficiency I mentioned above.

I'll do some more experiments to try to quantify this in light of this
new information; I still think that if the overhead is marginal there
are significant wins to this approach.
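
One rough way to measure that overhead, sketched under assumptions: build a
single pack for the whole history, build a series of thin packs for
consecutive commit ranges, and compare the byte counts. The toy repository,
the tags v1..v6 and the pack_size helper are hypothetical, and the slices
here use older commits as bases via --revs --thin rather than the newer-base
trick quoted above. Numbers from a repository this small say nothing about
git.git; the point is only the shape of the measurement.

```shell
#!/bin/sh
# Sketch only: toy repository, tag names and the pack_size helper are hypothetical.
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example"
git config user.email "example@example.invalid"

# Six tagged revisions, each changing a different subset of lines.
n=1
while [ "$n" -le 6 ]; do
    awk -v n="$n" 'BEGIN { for (i = 1; i <= 500; i++) print "line", i, (i % 50 == n ? "changed" : "") }' > data.txt
    git add data.txt
    git commit -q -m "revision $n"
    git tag "v$n"
    n=$((n + 1))
done

# Read revision arguments on stdin, print the size in bytes of the resulting pack.
pack_size() { git pack-objects --revs --thin --stdout -q | wc -c | tr -d ' '; }

# One pack for everything versus three slices of two commits each.
whole=$(echo v6 | pack_size)
slice1=$(echo v2 | pack_size)
slice2=$(printf '%s\n' v4 '^v2' | pack_size)
slice3=$(printf '%s\n' v6 '^v4' | pack_size)
sliced=$((slice1 + slice2 + slice3))

echo "single pack:  $whole bytes"
echo "three slices: $sliced bytes"
echo "overhead:     $(( (sliced - whole) * 100 / whole ))%"
```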

> And again this might 
> work only if a lot of people are interested in the same repository at 
> the same time, and of course most people have no big incentive to "seed" 
> once they got their copy. So I'm not sure if that might work that well 
> in practice.

Throw away terms like "seeding" and replace them with "mirroring".  Sites
which currently house mirrors could potentially be helping serve git
repos, too.  Popular projects could have many mirrors, and on the edges
of the internet, git servers could mirror many projects for users in
their country.

Sam


