From: Petr Baudis <pasky@suse.cz>
To: git@vger.kernel.org
Subject: Cloning speed comparison
Date: Sat, 13 Aug 2005 03:54:02 +0200
Message-ID: <20050813015402.GC20812@pasky.ji.cz>
Hello,
I wondered how slow the protocols other than rsync are. The results
(a bit dubious, especially wrt. caching on the remote side) are:
    repo    method          time
    git     clone-pack:ssh  25s
    git     rsync           27s
    git     http-pull       47s
    git     dumb-http       54s
    git     ssh-pull        660s
    cogito  clone-pack:ssh  35s (!)
    cogito  rsync           140s
    cogito  ssh-pull        480s
    cogito  http-pull       extrapolated to about an hour!
    cogito  dumb-http       N/A (missing info in the repository)
(I didn't test the git server protocol, since kernel.org doesn't run
a git server and I was too lazy to set one up.)
The git repository contains one big pack, one small pack and a few
standalone objects (5882 objects in total), while the cogito repository
consists of standalone objects only (9670 objects in total, 8681 reachable).
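
To put the git-repository rows of the first table on a common scale, the times can be divided by the fastest one. The following is only a sketch: the dictionary simply copies the measured seconds from that table, and the function name `slowdowns` is made up for illustration.

```python
# Wall-clock times in seconds, copied from the git rows of the first table.
git_times = {
    "clone-pack:ssh": 25,
    "rsync": 27,
    "http-pull": 47,
    "dumb-http": 54,
    "ssh-pull": 660,
}


def slowdowns(times):
    """Return each method's time divided by the fastest method's time."""
    fastest = min(times.values())
    return {method: seconds / fastest for method, seconds in times.items()}


if __name__ == "__main__":
    ranked = sorted(slowdowns(git_times).items(), key=lambda kv: kv[1])
    for method, factor in ranked:
        print(f"{method:16s} {factor:6.2f}x")
```

Since these are single, warm-cache runs, the factors are only a rough guide.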
The numbers are off by some epsilons, as I didn't bother with multiple
measurements, but they shouldn't be hugely off for a general comparison.
The network connection has 2048kbit/s download; the other side was
www.kernel.org for HTTP and rsync, and master.kernel.org for ssh.
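
Anyone who wants to redo this with multiple measurements could use something like the sketch below. It is not what was used for the numbers above; `time_command` is a hypothetical helper, and the command in the example is a harmless placeholder that has to be replaced with the real clone command (removing the target directory between runs).

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,  # raise if the command fails
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; substitute the
    # actual clone invocation to measure something meaningful.
    argv = [sys.executable, "-c", "pass"]
    samples = time_command(argv, runs=3)
    print(f"first  {samples[0]:.3f}s")
    print(f"median {statistics.median(samples):.3f}s")
    print(f"max    {max(samples):.3f}s")
```

Printing the first run separately matters here, since the first access may hit a cold cache on the remote side while later runs do not.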
Pulling from localhost (128M of RAM, 5M to 30M free - awful, yes):
    cogito  rsync:ssh       150s
    cogito  ssh-pull        120s (but didn't complete, see PS)
    cogito  http-pull       260s
    cogito  clone-pack:ssh  340s
Anyway, clone-pack is a clear winner over the network (but someone should
re-check that, especially compared to rsync, wrt. server-side file
caching): really fast, but not very practical for anonymous access.
Any volunteers for a simple CGI (or gitweb addon) script plus HTTP support
in clone-pack? HTTP is certainly the most suitable protocol for
anonymous pulls, so it's a shame it's still that sluggish.
It is so slow here because it has a very ugly access pattern on the
object database and my RAM is full, so nothing stays cached. Even on
the servers it was slower at first; unfortunately, I didn't measure
that, so the top table shows second accesses. Still, I would expect
the big repositories to stay mostly in the server cache, so I don't
think this is that big a problem for those.
PS:
With the latest git version at the time of writing:
    $ time cg-clone git+ssh://pasky@localhost/home/pasky/WWW/dev/git/.g cogito
    ...
    progress: 5759 objects, 10292457 bytes
    $ time cg-clone http://localhost/~pasky/dev/git/.g cogito
    ...
    progress: 8681 objects, 14881571 bytes
The ssh pull stopped at 5759 of the 8681 reachable objects; the HTTP
pull fetched all of them.
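
As a rough sanity check on the network numbers, the byte count reported by the complete HTTP pull can be turned into a lower bound on transfer time over the 2048kbit/s link. This is only a sketch under assumptions: that the progress counter reflects the bytes that would go over the wire, and that 1 kbit means 1000 bits. The function name is made up for illustration.

```python
def min_transfer_seconds(num_bytes, link_kbit_per_s, bits_per_kbit=1000):
    """Lower bound on transfer time, ignoring latency and protocol overhead."""
    return num_bytes * 8 / (link_kbit_per_s * bits_per_kbit)


if __name__ == "__main__":
    http_pull_bytes = 14881571  # from the cg-clone progress line of the HTTP pull
    link_kbit_per_s = 2048      # download speed of the link used for the network tests
    bound = min_transfer_seconds(http_pull_bytes, link_kbit_per_s)
    print(f"at least {bound:.1f}s to move {http_pull_bytes} bytes")
```

Under those assumptions the bound comes out at roughly 58s, which is longer than the 35s measured for cogito over clone-pack:ssh. That would suggest the pack moved less data than the individual objects do, but what the counter measures has not been verified here.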
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
If you want the holes in your knowledge showing up try teaching
someone. -- Alan Cox