From: Pete Wyckoff <pw@padd.com>
To: Johannes Sixt <j.sixt@viscovery.net>
Cc: git@vger.kernel.org
Subject: Re: refs/replace advice
Date: Fri, 29 Jul 2011 18:46:33 -0400
Message-ID: <20110729224633.GA21355@arf.padd.com>
In-Reply-To: <4E32D6A1.8020304@viscovery.net>
j.sixt@viscovery.net wrote on Fri, 29 Jul 2011 17:49 +0200:
> On 7/29/2011 17:31, Pete Wyckoff wrote:
> > I'm trying to use "git replace" to avoid cloning the entire set
> > of duplicate commits across a slow inter-site link. Like this:
> >
> >     ...---A----B----C            site1/top
> >            \
> >             D---E---F            site1/proj
> >
> >     ...---A'---B'---C'           site2/top
> >
> > It is true that "git diff C C'" is empty: they are identical.
> ...
> > I thought maybe I could "git fetch --depth=N" where N would cover
> > the range A'..site2/top, then replace. But testing with "git
> > fetch --depth=3" still wants to fetch 100k objects.
>
> On site2, don't you want to 'git fetch --depth=N site1' such that F down
> to at least C (but not much more) is fetched, and then apply the graft or
> replacement on site2?
Yes, that makes sense: a shallow clone still needs to pull the entire tree.
On site1 (bare .git repo):
$ du -sm .
542 .
$ git merge-base site1/proj site1/top
ff016f956ccae7878a1b322ba950a0088c6e2ded ;# this is A
$ git rev-list ff016f956ccae7878a1b322ba950a0088c6e2ded | wc
566 566 23206
On site2:
$ du -sm .git
649 .git
$ git rev-parse :/1384557
0f95d91c37bc870d610b7bd45b316ab219750d31 ;# this is A'
$ git rev-list 0f95d91c37bc870d610b7bd45b316ab219750d31 | wc
566 566 23206
Same number of commits all the way back to the beginning of time,
but the timestamp in the root commit is different, so all the SHA1s
are different.
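To illustrate why a single differing root timestamp changes every later SHA1, here is a small sketch using two throwaway repositories. The directory names, file contents, identity, and timestamps are all made up for the demo; the only difference between the two repositories is the date of the root commit.

```shell
#!/bin/sh
# Sketch: identical content, different root timestamp
# => identical trees, but different commit ids all the way up.
set -e
tmp=$(mktemp -d)

# Fixed (made-up) identity so that only the dates can differ.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# make_repo DIR ROOT_DATE: two commits; only the root commit's date varies.
make_repo() {
    git init -q "$1"
    (
        cd "$1"
        echo one >file
        git add file
        GIT_AUTHOR_DATE="$2" GIT_COMMITTER_DATE="$2" git commit -q -m root
        echo two >>file
        GIT_AUTHOR_DATE="1311900000 +0000" GIT_COMMITTER_DATE="1311900000 +0000" \
            git commit -q -a -m second
    )
}

make_repo "$tmp/site1" "1300000000 +0000"
make_repo "$tmp/site2" "1300000001 +0000"   # root is one second later

tree1=$(git -C "$tmp/site1" rev-parse 'HEAD^{tree}')
tree2=$(git -C "$tmp/site2" rev-parse 'HEAD^{tree}')
tip1=$(git -C "$tmp/site1" rev-parse HEAD)
tip2=$(git -C "$tmp/site2" rev-parse HEAD)

echo "trees:   $tree1 $tree2"
echo "commits: $tip1 $tip2"
```

The second commits are identical apart from their parent line, and that alone is enough to give them different ids, while the trees they point at stay the same.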
On site2:
$ time git fetch git://site1/repo
warning: no common commits
remote: Counting objects: 124166, done.
remote: Compressing objects: 100% (64472/64472), done.
remote: Total 124166 (delta 59815), reused 121350 (delta 57062)
Receiving objects: 100% (124166/124166), 462.31 MiB | 5.31 MiB/s, done.
Resolving deltas: 100% (59815/59815), done.
From git://site1/repo
* branch HEAD -> FETCH_HEAD
0m56.25s user 0m5.18s sys 2m29.45s elapsed 41.11 %CPU
A brand new repo on site2, cloning this time with a teensy depth:
$ time git fetch --depth=3 git://site1/repo
warning: no common commits
remote: Counting objects: 96440, done.
remote: Compressing objects: 100% (58844/58844), done.
remote: Total 96440 (delta 36454), reused 92650 (delta 35169)
Receiving objects: 100% (96440/96440), 415.87 MiB | 7.38 MiB/s, done.
Resolving deltas: 100% (36454/36454), done.
From git://site1/repo
* branch HEAD -> FETCH_HEAD
0m40.40s user 0m5.27s sys 1m50.29s elapsed 41.41 %CPU
Hardly any savings in data transport: 415.87 MiB instead of 462.31 MiB.
I was hoping it would be possible to get just the changes, but walking
back to FETCH_HEAD~3 shows that it imports all the files. That makes
sense given the use case for shallow clone. What I want is to tell the
fetch machinery that I already have one of the commits it is going to
see.
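A toy reproduction of that behaviour, assuming nothing beyond stock git: a shallow fetch cuts the commit history short, but each commit it keeps still needs its complete tree. The repository names and the file count are invented for the demo.

```shell
#!/bin/sh
# Sketch: --depth limits the number of commits, not the files per commit.
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "Remote": 20 files in the first commit, then two one-line edits.
git init -q "$tmp/remote"
(
    cd "$tmp/remote"
    i=1
    while [ "$i" -le 20 ]; do
        echo "content $i" >"file$i"
        i=$((i + 1))
    done
    git add .
    git commit -q -m "import"
    echo change1 >>file1; git commit -q -a -m "edit 1"
    echo change2 >>file2; git commit -q -a -m "edit 2"
)

# Shallow fetch of only the newest commit into an empty repository.
# A file:// URL is used because --depth is ignored for plain local paths.
git init -q "$tmp/local"
git -C "$tmp/local" fetch -q --depth=1 "file://$tmp/remote" HEAD

# Count what actually arrived.
commits=$(git -C "$tmp/local" rev-list --count FETCH_HEAD)
blobs=$(git -C "$tmp/local" rev-list --objects FETCH_HEAD |
        cut -d' ' -f1 |
        git -C "$tmp/local" cat-file --batch-check='%(objecttype)' |
        grep -c '^blob$')

echo "commits fetched: $commits, blobs fetched: $blobs"
```

Only one commit comes across, yet every file in its tree comes with it, even though the last commit touched a single file.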
I'll just tell people to put up with the full copy, and try to fix
things so that only one site creates the git repo from p4 in the future.
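For the record, once both histories sit in the same repository, the stitching step itself is small. The sketch below uses two made-up root commits with identical trees as stand-ins for A and A', plus a D built on A. After "git replace", every lookup of A returns the content of A'. This only changes how the local repository reads history; it does nothing for the transfer size.

```shell
#!/bin/sh
# Sketch: make lookups of one commit return another via refs/replace.
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/repo"
cd "$tmp/repo"
echo data >file
git add file
tree=$(git write-tree)

# Same tree, same message, different dates: stand-ins for A and A'.
a1=$(GIT_AUTHOR_DATE="1300000000 +0000" GIT_COMMITTER_DATE="1300000000 +0000" \
     git commit-tree -m "A" "$tree")
a2=$(GIT_AUTHOR_DATE="1300000001 +0000" GIT_COMMITTER_DATE="1300000001 +0000" \
     git commit-tree -m "A" "$tree")

# D is built on A, as on site1/proj.
d=$(git commit-tree -p "$a1" -m "D" "$tree")

# Committer timestamp of D's parent, before and after the replacement.
before=$(git log -1 --format=%ct "$d^")
git replace "$a1" "$a2"
after=$(git log -1 --format=%ct "$d^")

echo "parent of D had timestamp $before, now reads as $after"
```

Running any command with "git --no-replace-objects" shows the original, unreplaced history again.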
Thanks for looking,
-- Pete