* O(#haves*...) behaviour in "have <sha>" processing in upload-pack
@ 2008-09-13 0:11 Thomas Rast
2008-09-13 12:18 ` Thomas Rast
0 siblings, 1 reply; 2+ messages in thread
From: Thomas Rast @ 2008-09-13 0:11 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 2899 bytes --]
Hi *
evilchelu (Cristi Balan) pointed out on IRC that 'git fetch' takes a
long time when fetching a history-disjoint repository.
For example,
mkdir test && cd test
git init
git remote add -f paperclip git://github.com/thoughtbot/paperclip.git
git remote add -f hoptoad git://github.com/thoughtbot/hoptoad_notifier.git
git remote add -f aasm git://github.com/rubyist/aasm.git
git remote add -f forgot git://github.com/greenisus/forgot_password.git
git remote add -f restful git://github.com/technoweenie/restful-authentication.git
(taken straight from evilchelu's example, but it should be the same
with random repositories).
Peeking at the transmission with Wireshark, there is a noticeable
pattern of 4-5s delays before _every_ "0008NAK\n" line sent by the
server.
Looking at upload-pack.c, it seems that the server does far too much
work when processing the "have" lines. I've only just read into this
area of code, but the rough idea seems to be:
[404: get_common_commits()]
for (H = every "have" line) {
[321: got_sha1()]
flag H and its parents (shallow!) as THEY_HAVE
[415: get_common_commits()]
if (we do not have H in our object store) {
[367: ok_to_give_up()]
for (W = every "want" object specified earlier) {
[338: reachable()]
walk the entire history to see if anything flagged
THEY_HAVE so far is reachable from W
}
[418: get_common_commits()]
if the innermost test succeeded for all W: ACK this H so the
client stops walking history from it
}
}
The entire loop seems to have O(h*w*n) (n=history) complexity, which
probably is to blame for the delays.
* Isn't this ok_to_give_up() test moot? If H is not in our object
store, it cannot be of any use in the transfer (of our history to
the client). So if we are going to fake an ACK to stop the client
digging on this side of his history, we might as well send it right
away. (And what does reaching a THEY_HAVE commit from all W's have
to do with it?)
* Even assuming it is not, it would save some work if the server
avoided walking the entire history for every H. For example, it
could buffer up all H's until a "0000\n" arrives, which currently
seems to be 32 haves, then check all of them in a single pass over
history. (Unfortunately the 'h' factor cannot be removed completely
unless we buffer all of them, which again defeats the point.)
Much of the code in question was added in 937a515 (upload-pack: stop
the other side when they have more roots than we do., 2006-07-05).
[I hope I'm making some sense, it's far too late here, but it was
either this or trying to understand it again in the morning.]
- Thomas
--
Thomas Rast
trast@student.ethz.ch
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: O(#haves*...) behaviour in "have <sha>" processing in upload-pack
2008-09-13 0:11 O(#haves*...) behaviour in "have <sha>" processing in upload-pack Thomas Rast
@ 2008-09-13 12:18 ` Thomas Rast
0 siblings, 0 replies; 2+ messages in thread
From: Thomas Rast @ 2008-09-13 12:18 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 811 bytes --]
I wrote:
> * Isn't this ok_to_give_up() test moot? If H is not in our object
> store, it cannot be of any use in the transfer (of our history to
> the client). So if we are going to fake an ACK to stop the client
> digging on this side of his history, we might as well send it right
> away.
Never mind this part, it's wrong:
Let B=$(merge-base H W). Suppose H is unknown to the server, but B is
known. Then sending a fake ACK as a reply to H will cause
* the client to believe we have everything reachable from H, including
B, and cease sending any history reachable from H; and thus
* the server to believe that the client does not have B, since it did
not list this commit in a "have" line.
Sorry for the noise.
- Thomas
--
Thomas Rast
trast@student.ethz.ch
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2008-09-13 12:26 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-13 0:11 O(#haves*...) behaviour in "have <sha>" processing in upload-pack Thomas Rast
2008-09-13 12:18 ` Thomas Rast
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).