git.vger.kernel.org archive mirror
* Cloning speed comparison
From: Petr Baudis @ 2005-08-13  1:54 UTC
  To: git

  Hello,

  I've wondered how slow the protocols other than rsync are, and the
(well, a bit dubious; especially wrt. caching on the remote side)
results are:

	git	clone-pack:ssh	25s
	git	rsync		27s
	git	http-pull	47s
	git	dumb-http	54s
	git	ssh-pull	660s

	cogito	clone-pack:ssh	35s (!)
	cogito	rsync		140s
	cogito	ssh-pull	480s
	cogito	http-pull	extrapolated to about an hour!
	cogito	dumb-http	N/A (missing info in the repository)

  (I didn't test the git server protocol, since kernel.org doesn't run
a git server and I was too lazy to set one up.)
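
  To read the two tables above as relative slowdowns rather than raw
seconds, a rough sketch follows. The dictionary and function names are
made up for illustration; only the timings come from the tables. The
cogito http-pull and dumb-http rows are left out since no measured
value exists for them.

```python
# Wall-clock clone times in seconds, copied from the tables above.
git_times = {
    "clone-pack:ssh": 25,
    "rsync": 27,
    "http-pull": 47,
    "dumb-http": 54,
    "ssh-pull": 660,
}

# cogito http-pull (extrapolated only) and dumb-http (N/A) are omitted.
cogito_times = {
    "clone-pack:ssh": 35,
    "rsync": 140,
    "ssh-pull": 480,
}


def slowdown(times):
    """Return each method's time as a multiple of the fastest method."""
    fastest = min(times.values())
    return {name: round(t / fastest, 2) for name, t in times.items()}


if __name__ == "__main__":
    for repo, times in (("git", git_times), ("cogito", cogito_times)):
        for name, factor in slowdown(times).items():
            print(f"{repo}\t{name}\t{factor}x")
```

  These are single-run numbers, so the factors are only as good as the
measurements they come from.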

  The git repository contains one big pack, one small pack and a few
standalone objects (5882 objects in total), while cogito is standalone
objects only (9670 objects in total, 8681 reachable).

  The numbers are off by some epsilons, as I didn't bother with multiple
measurements, but they shouldn't be hugely off for a general comparison.
The network connection has 2048kbit/s download; the other side was
www.kernel.org for HTTP and rsync, and master.kernel.org for ssh.
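
  For a sense of scale, here is a rough sketch of the best-case transfer
time on such a link. It assumes that the byte count printed for the full
cogito HTTP clone in the PS (14881571 bytes for 8681 objects) roughly
matches what goes over the wire for unpacked objects, and that 1 kbit
means 1000 bits; both are assumptions, and the function name is made up
for illustration.

```python
LINK_KBIT_PER_S = 2048        # download bandwidth stated above
PAYLOAD_BYTES = 14_881_571    # byte count from the PS (assumed wire size)


def min_transfer_seconds(payload_bytes, link_kbit_per_s, bits_per_kbit=1000):
    """Lower bound on transfer time, ignoring latency and protocol overhead."""
    return payload_bytes * 8 / (link_kbit_per_s * bits_per_kbit)


if __name__ == "__main__":
    bound = min_transfer_seconds(PAYLOAD_BYTES, LINK_KBIT_PER_S)
    print(f"best case: {bound:.1f}s")
```

  Under these assumptions the bound comes out at a bit under a minute,
which can be held against the measured cogito times in the first table.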

  Pulling from localhost (128M of RAM, 5M to 30M free - awful, yes):

	cogito	rsync:ssh	150s
	cogito	ssh-pull	120s (but didn't complete, see PS)
	cogito	http-pull	260s
	cogito	clone-pack:ssh	340s

  Anyway, clone-pack is a clear winner over the network (but someone
should re-check that, especially compared to rsync, wrt. server-side
file caching); it is really fast, but not very practical for anonymous
access. Any volunteers for a simple CGI (or gitweb addon) script + HTTP
support in clone-pack? HTTP is certainly the most suitable protocol for
anonymous pulls, so it's a shame it's still that sluggish.

  clone-pack is so slow here (on localhost) because it has a very ugly
access pattern on the object database and my RAM is full, so nothing
gets cached; even on the servers, it was slower at first -
unfortunately, I didn't measure that, so what's in the top table are
second accesses. Still, I would expect the big repositories to stay
mostly in the server cache, so this isn't that big a problem for those,
I think.

  PS:
	With the latest git version as of the time of writing:
	$ time cg-clone git+ssh://pasky@localhost/home/pasky/WWW/dev/git/.g cogito
	...
	progress: 5759 objects, 10292457 bytes
	$ time cg-clone http://localhost/~pasky/dev/git/.g cogito
	...
	progress: 8681 objects, 14881571 bytes

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
If you want the holes in your knowledge showing up try teaching
someone.  -- Alan Cox

