From: Junio C Hamano <junkio@cox.net>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git@vger.kernel.org, Ralf Baechle <ralf@linux-mips.org>,
Linus Torvalds <torvalds@osdl.org>
Subject: [RFH] further upload-pack/fetch-pack tweaks
Date: Thu, 06 Jul 2006 01:43:18 -0700
Message-ID: <7vejwz6s49.fsf@assigned-by-dhcp.cox.net>
I was reviewing this issue and have an updated attempt that
solves it slightly differently. I think I have something
working, but would like to borrow extra sets of eyeballs.
From: Junio C Hamano <junkio@cox.net>
Subject: [PATCH/RFC] upload-pack: stop "ack continue" when we know common commits for wanted refs
To: Ralf Baechle <ralf@linux-mips.org>
cc: git@vger.kernel.org, Linus Torvalds <torvalds@osdl.org>
Date: Fri, 26 May 2006 19:20:54 -0700
Message-ID: <7vfyiwi4xl.fsf@assigned-by-dhcp.cox.net>
When the downloader's repository has more roots than the server
side has, the "have" exchange to figure out recent common
commits ends up traversing the whole history of branches that
only exist on the downloader's side. When the downloader is
asking for newer commits on the branch that exists on both ends,
this is totally unnecessary.
This adds logic to the server side to see if the wanted refs can
reach the "have" commits received so far, and stop issuing "ack
continue" once all of them can be reached from "have" commits.
The idea in the new implementation is to notice when the
downloader sends a "have" for an object we do not know about.
If we have already received some "have"s from them, and it is
still unknown whether some "want"s can reach those "have"s, we
traverse the commit ancestry down to the oldest "have" seen so
far (this is just a heuristic) to see if every "want" has some
common ancestor with the other side. Once we know every "want"
can reach some "have" we have seen so far, we send "ACK
continue" whenever the downloader sends a "have" that we do not
have, to make the downloader stop traversing that futile branch
which leads to a root we do not have. The code sits near the
tip of "pu".
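The decision described above can be modelled in a few lines. This is only a toy sketch, not the upload-pack code: the function names, the dictionary-based commit graph and the commit labels are all made up for illustration, and the "stop at the oldest have" cut-off is left out, so the traversal here is an unbounded ancestry walk.

```python
from collections import deque

def reaches_any(parents, start, targets):
    """Return True if `start` or one of its ancestors is in `targets`."""
    seen = {start}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        if commit in targets:
            return True
        for parent in parents.get(commit, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False

def all_wants_covered(parents, wants, common_haves):
    """True once every "want" can reach at least one "have" we also know."""
    return bool(common_haves) and all(
        reaches_any(parents, want, common_haves) for want in wants
    )

def should_ack_continue(parents, wants, common_haves, have):
    """Toy model of the server's reaction to a single "have" line."""
    if have in parents:
        # We know this commit: remember it as common and acknowledge it.
        common_haves.add(have)
        return True
    # Unknown commit: acknowledge it anyway once all wants are covered,
    # so that the downloader stops walking this side branch.
    return all_wants_covered(parents, wants, common_haves)

# Hypothetical server history: t1 <- t2 <- t3, and the client wants t3.
server = {"t1": [], "t2": ["t1"], "t3": ["t2"]}
common = set()
first = should_ack_continue(server, ["t3"], common, "g9")   # unknown, nothing common yet
second = should_ack_continue(server, ["t3"], common, "t2")  # known to the server
third = should_ack_continue(server, ["t3"], common, "g8")   # unknown, but t3 is covered
```

In this model the first unknown "have" gets no acknowledgement, while the unknown "have" that arrives after a common commit has been found is acknowledged.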
I've started from a clone of the git.git repository and tried to
fetch the "todo" branch from another clone that does not have
anything but the "todo" branch. So the downloader has five
extra roots (one for git.git itself, one for gitk, one for
gitweb, and one each for htmldocs and manpages).
    # upstream is just "todo" branch and nothing else
    git clone -n git.git upstream
    cd upstream
    mv .git/refs trash
    mkdir -p .git/refs/heads .git/refs/tags
    echo 'ref: refs/heads/master' >.git/HEAD
    cat trash/heads/todo >.git/refs/heads/master
    git repack -a -d
    cd ..

    # downloader has up-to-date git.git but stale "todo"
    git clone -n git.git downloader
    cd downloader
    git checkout todo
    git reset --hard HEAD~30
    git repack -a -d

    # try downloading things from upstream
    git fetch-pack -k -v ../upstream master 2>/var/tmp/new.out
    git fetch-pack -k -v --exec=old-git-upload-pack \
        ../upstream master 2>/var/tmp/old.out
It does send a smaller number of "have"s than the current code,
but I noticed that near the end of the transfer, after it gets
an "ACK continue" for a common commit on the "todo" branch and
an "ACK continue" for a not-common commit on the "master"
branch, it keeps sending the commits that are marked on the
fetch-pack side as COMMON_REF (so the last ref sent is the
v0.99^0 commit). upload-pack has already told the downloader
that whatever is reachable from the "master" branch are commits
both sides agreed are common, so I suspect it should not go
that far down that path to reach the v0.99^0 commit.
I have a feeling that either the get_rev() or the mark_common()
logic is not properly marking ancestors of commits that are
known to be common. Does this ring a bell?
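To spell out the behaviour I would expect, here is a toy model. It is not the fetch-pack code: the function name below only borrows mark_common() for readability, and the graph and commit labels are made up. It simply shows that once a commit is acknowledged as common, all of its ancestors should be treated as common too, and so should never be sent as "have".

```python
def mark_common(parents, commit, common):
    """Mark `commit` and every ancestor of it as common (toy model)."""
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in common:
            continue  # already marked, so its ancestors are marked too
        common.add(current)
        stack.extend(parents.get(current, ()))

# Hypothetical client history: a <- b <- c <- d
client = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
common = set()
mark_common(client, "c", common)  # the server acknowledged "c"

# Walking from the tip, only commits not marked common remain to be sent.
pending = [commit for commit in ["d", "c", "b", "a"] if commit not in common]
```

After the acknowledgement of "c", only "d" is left to be sent in this model; if "b" or "a" still went out as "have", the marking would not be propagating.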