Git development
From: Junio C Hamano <junkio@cox.net>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git@vger.kernel.org, Ralf Baechle <ralf@linux-mips.org>,
	Linus Torvalds <torvalds@osdl.org>
Subject: [RFH] further upload-pack/fetch-pack tweaks
Date: Thu, 06 Jul 2006 01:43:18 -0700
Message-ID: <7vejwz6s49.fsf@assigned-by-dhcp.cox.net>

I was reviewing this issue and have an updated attempt that
solves it slightly differently.  I think I have something
working but would like to borrow extra sets of eyeballs.

    From: Junio C Hamano <junkio@cox.net>
    Subject: [PATCH/RFC] upload-pack: stop "ack continue" when we know common commits for wanted refs
    To: Ralf Baechle <ralf@linux-mips.org>
    cc: git@vger.kernel.org, Linus Torvalds <torvalds@osdl.org>
    Date: Fri, 26 May 2006 19:20:54 -0700
    Message-ID: <7vfyiwi4xl.fsf@assigned-by-dhcp.cox.net>

    When the downloader's repository has more roots than the server
    side has, the "have" exchange to figure out recent common
    commits ends up traversing the whole history of branches that
    only exist on the downloader's side.  When the downloader is
    asking for newer commits on the branch that exists on both ends,
    this is totally unnecessary.

    This adds logic to the server side to see if the wanted refs can
    reach the "have" commits received so far, and stop issuing "ack
    continue" once all of them can be reached from "have" commits.

The idea in the new implementation is as follows.  Suppose the
downloader sends a "have" for an object we do not know about, we
have already received some "have"s from them, and for some
"want"s we do not yet know whether they are reachable from those
"have"s.  In that case we traverse the commit ancestry down to
the oldest "have" seen so far (this is just a heuristic) to see
if every "want" has some common ancestor with the other side.
Once we know every "want" has a common ancestor among the
"have"s we have seen so far, we send "ACK continue" whenever the
downloader sends a "have" that we do not have, which makes the
downloader stop traversing that futile branch leading to a root
we do not have.  The code sits near the tip of "pu".
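
As a rough illustration of the reachability test described
above, and not the code in "pu": the sketch below checks whether
every "want" has at least one "have" among its ancestors.  The
function name all_wants_covered is hypothetical, it assumes a
git recent enough to provide "git merge-base --is-ancestor", and
it ignores the "stop at the oldest have" cut-off that the real
traversal uses.

```shell
# Hypothetical helper, not part of upload-pack.
# $1: space-separated "want" commits, $2: space-separated "have" commits.
# Succeeds only when every "want" has at least one "have" as an ancestor.
all_wants_covered () {
        wants=$1
        haves=$2
        for want in $wants
        do
                covered=no
                for have in $haves
                do
                        # A "have" we do not know about makes git exit
                        # non-zero, which counts as "not an ancestor".
                        if git merge-base --is-ancestor "$have" "$want" 2>/dev/null
                        then
                                covered=yes
                                break
                        fi
                done
                test "$covered" = yes || return 1
        done
        return 0
}
```

Once such a check succeeds, the server would have no further use
for "have"s that lead to roots it does not know about.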

I've started from a clone of git.git repository and tried to
fetch "todo" branch from another clone that does not have
anything but the "todo" branch.  So the downloader has five
extra roots (one for git.git itself, one for gitk, one for
gitweb, and one each for htmldocs and manpages).

        # upstream is just "todo" branch and nothing else
        git clone -n git.git upstream
        cd upstream
        mv .git/refs trash
        mkdir -p .git/refs/heads .git/refs/tags
        echo 'ref: refs/heads/master' >.git/HEAD
        cat trash/heads/todo >.git/refs/heads/master
        git repack -a -d
        cd ..

        # downloader has up-to-date git.git but stale "todo"
        git clone -n git.git downloader
        cd downloader
        git checkout todo
        git reset --hard HEAD~30
        git repack -a -d

        # try downloading things from upstream
        git fetch-pack -k -v ../upstream master 2>/var/tmp/new.out
        git fetch-pack -k -v --exec=old-git-upload-pack \
                ../upstream master 2>/var/tmp/old.out
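
To compare the two traces, counting the "have" lines in each is
a quick first check.  A minimal sketch, assuming the verbose
output logs each "have" sent as a line beginning with "have ";
count_haves is a hypothetical helper name.

```shell
# Hypothetical helper: count the "have" lines in a saved
# "fetch-pack -v" trace.  Assumes one "have <object name>" line is
# logged per "have" sent.
count_haves () {
        grep -c '^have ' "$1"
}
```

Running count_haves on /var/tmp/new.out and on /var/tmp/old.out
gives the two numbers to compare.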


It does send a smaller number of "have"s than the current code.
However, I noticed that near the end of the transfer, after it
gets an "ACK continue" for a common commit on the "todo" branch
and an "ACK continue" for a not-common commit on the "master"
branch, it keeps sending the commits that are marked on the
fetch-pack side as COMMON_REF (so the last ref sent is the
v0.99^0 commit).  By then upload-pack has told the downloader
that whatever is reachable from the "master" branch is common to
both sides, so I suspect it should not go that far down that
path to reach the v0.99^0 commit.

I have a feeling that either the get_rev() or the mark_common()
logic is not properly marking ancestors of commits that are
known to be common.  Does this ring a bell?
