* refs/replace advice
From: Pete Wyckoff @ 2011-07-29 15:31 UTC (permalink / raw)
To: git
I've got two near-identical git repos, both imported from
gigantic upstream p4 repos. They started at slightly different
times so have different commit SHA1s, even though the tree
contents are the same. I can't filter-branch either of them; too
many users already.
I'm trying to use "git replace" to avoid cloning the entire set
of duplicate commits across a slow inter-site link. Like this:
    ...---A----B----C    site1/top
           \
            D---E---F    site1/proj

    ...---A'---B'---C'   site2/top
It is true that "git diff C C'" is empty: they are identical.
This set of commands, run from site2, clones most of the repo
locally (up to C'), then grabs the few changes D..F from the
faraway site1:
git clone /path/to/site2.git repo
cd repo
git remote add -f site1 /path/to/faraway/site1.git
But it causes an entire fetch of all commits because C != C'.
I'd prefer it just to fetch D, E and F. So I try:
git replace A' A
but it still fetches everything. I toyed with grafting
site1's A on top of the parent of our local A':
echo A A'^ > .git/info/grafts
no luck.
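For reference, here is a minimal sketch of what "git replace <object> <replacement>" does inside a single repository. It is not the poster's setup: the repository, the identity, the timestamps and the names R1, A1, R2, A2 (standing in for the root and A on site1, and the root and A' on site2) are all made up for illustration. It only shows that local history walks read the replacement; as reported above, this did not reduce what the fetch transferred.

```shell
#!/bin/sh
# Sketch only: two unrelated histories with the same tree, joined by a replace ref.
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Hypothetical identity so commit-tree does not depend on local config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo hello > file.txt
git add file.txt
tree=$(git write-tree)

# mk DATE MESSAGE [commit-tree options...] prints the id of the new commit.
mk() {
    d=$1; m=$2; shift 2
    GIT_AUTHOR_DATE="$d" GIT_COMMITTER_DATE="$d" \
        git commit-tree "$tree" "$@" -m "$m"
}

R1=$(mk "1311953460 +0000" root)          # root as imported on site1
A1=$(mk "1311953500 +0000" A -p "$R1")    # A
R2=$(mk "1311953520 +0000" root)          # root as imported on site2 (later timestamp)
A2=$(mk "1311953500 +0000" A -p "$R2")    # A'

parent_before=$(git log -1 --format=%P "$A2")
git replace "$A2" "$A1"                   # same direction as "git replace A' A"
parent_after=$(git log -1 --format=%P "$A2")

echo "parent of A' before replace: $parent_before"
echo "parent of A' after replace:  $parent_after"
```

The replacement is stored as a ref under refs/replace/ in the local repository; "git replace -d" removes it again.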
I thought maybe I could "git fetch --depth=N" where N would cover
the range A'..site2/top, then replace. But testing with "git
fetch --depth=3" still wants to fetch 100k objects.
Any ideas?
-- Pete
* Re: refs/replace advice
From: Johannes Sixt @ 2011-07-29 15:49 UTC (permalink / raw)
To: Pete Wyckoff; +Cc: git
On 7/29/2011 17:31, Pete Wyckoff wrote:
> I'm trying to use "git replace" to avoid cloning the entire set
> of duplicate commits across a slow inter-site link. Like this:
>
>     ...---A----B----C    site1/top
>            \
>             D---E---F    site1/proj
>
>     ...---A'---B'---C'   site2/top
>
> It is true that "git diff C C'" is empty: they are identical.
...
> I thought maybe I could "git fetch --depth=N" where N would cover
> the range A'..site2/top, then replace. But testing with "git
> fetch --depth=3" still wants to fetch 100k objects.
On site2, don't you want to 'git fetch --depth=N site1' such that F down
to at least C (but not much more) is fetched, and then apply the graft or
replacement on site2?
-- Hannes
* Re: refs/replace advice
From: Pete Wyckoff @ 2011-07-29 22:46 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
j.sixt@viscovery.net wrote on Fri, 29 Jul 2011 17:49 +0200:
> On 7/29/2011 17:31, Pete Wyckoff wrote:
> > I'm trying to use "git replace" to avoid cloning the entire set
> > of duplicate commits across a slow inter-site link. Like this:
> >
> >     ...---A----B----C    site1/top
> >            \
> >             D---E---F    site1/proj
> >
> >     ...---A'---B'---C'   site2/top
> >
> > It is true that "git diff C C'" is empty: they are identical.
> ...
> > I thought maybe I could "git fetch --depth=N" where N would cover
> > the range A'..site2/top, then replace. But testing with "git
> > fetch --depth=3" still wants to fetch 100k objects.
>
> On site2, don't you want to 'git fetch --depth=N site1' such that F down
> to at least C (but not much more) is fetched, and then apply the graft or
> replacement on site2?
Yes, that makes sense; a shallow clone still needs to pull the entire tree.
On site1 (bare .git repo):
$ du -sm .
542 .
$ git merge-base site1/proj site1/top
ff016f956ccae7878a1b322ba950a0088c6e2ded ;# this is A
$ git rev-list ff016f956ccae7878a1b322ba950a0088c6e2ded | wc
566 566 23206
On site2:
$ du -sm .git
649 .git
$ git rev-parse :/1384557
0f95d91c37bc870d610b7bd45b316ab219750d31 ;# this is A'
$ git rev-list 0f95d91c37bc870d610b7bd45b316ab219750d31 | wc
566 566 23206
Same number of commits all the way back to the beginning of time,
but the timestamp in the root commit is different, so all the SHA1s
are different.
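The effect is easy to reproduce on a small scale. The sketch below is an illustration, not the actual import: the directory names, the identity and the two timestamps are assumptions. It commits the same file in two fresh repositories with commit dates one minute apart and compares the resulting tree and commit IDs.

```shell
#!/bin/sh
# Sketch only: identical content, different commit timestamps.
set -eu
work=$(mktemp -d)

# Hypothetical identity; any fixed values would do.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# make_repo DIR DATE: one commit with fixed content and the given date.
make_repo() {
    git init -q "$1"
    echo "hello" > "$1/file.txt"
    git -C "$1" add file.txt
    GIT_AUTHOR_DATE="$2" GIT_COMMITTER_DATE="$2" \
        git -C "$1" commit -q -m "import"
}

make_repo "$work/site1" "1311953460 +0000"
make_repo "$work/site2" "1311953520 +0000"   # one minute later

tree1=$(git -C "$work/site1" rev-parse 'HEAD^{tree}')
tree2=$(git -C "$work/site2" rev-parse 'HEAD^{tree}')
commit1=$(git -C "$work/site1" rev-parse HEAD)
commit2=$(git -C "$work/site2" rev-parse HEAD)

echo "trees:   $tree1 $tree2"
echo "commits: $commit1 $commit2"
```

The two tree IDs match because a tree is hashed from file contents only, while the commit IDs differ because the dates are part of the commit object. Every descendant commit records its parent's ID, so the difference carries through the rest of the history.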
On site2:
$ time git fetch git://site1/repo
warning: no common commits
remote: Counting objects: 124166, done.
remote: Compressing objects: 100% (64472/64472), done.
remote: Total 124166 (delta 59815), reused 121350 (delta 57062)
Receiving objects: 100% (124166/124166), 462.31 MiB | 5.31 MiB/s, done.
Resolving deltas: 100% (59815/59815), done.
From git://site1/repo
* branch HEAD -> FETCH_HEAD
0m56.25s user 0m5.18s sys 2m29.45s elapsed 41.11 %CPU
A brand new repo on site2, cloning this time with a teensy depth:
$ time git fetch --depth=3 git://site1/repo
warning: no common commits
remote: Counting objects: 96440, done.
remote: Compressing objects: 100% (58844/58844), done.
remote: Total 96440 (delta 36454), reused 92650 (delta 35169)
Receiving objects: 100% (96440/96440), 415.87 MiB | 7.38 MiB/s, done.
Resolving deltas: 100% (36454/36454), done.
From git://site1/repo
* branch HEAD -> FETCH_HEAD
0m40.40s user 0m5.27s sys 1m50.29s elapsed 41.41 %CPU
No savings in data transport.
Was hoping it would be possible to get just the changes, but walking
back to FETCH_HEAD~3 shows that it imports all the files. That makes
sense given the use case for shallow clone. But I want to tell the
fetch machinery that I already have one of the commits it is going to
see.
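The same behaviour shows up in a small local experiment. This is a sketch under assumptions (file count, names and identity are invented, and a file:// URL stands in for the remote): the source history has one large initial commit followed by three one-line changes, yet a depth-1 fetch into an empty repository still has to bring every file of the tip commit.

```shell
#!/bin/sh
# Sketch only: a shallow fetch limits history depth, not the size of the tip tree.
set -eu
work=$(mktemp -d)

# Hypothetical identity; any fixed values would do.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/src"
i=1
while [ "$i" -le 50 ]; do                 # the "big import": 50 files
    echo "file $i" > "$work/src/f$i.txt"
    i=$((i + 1))
done
git -C "$work/src" add .
git -C "$work/src" commit -q -m "big import"
for n in 1 2 3; do                        # three small follow-up changes
    echo "edit $n" >> "$work/src/f1.txt"
    git -C "$work/src" commit -q -am "small change $n"
done

git init -q "$work/dst"
git -C "$work/dst" fetch -q --depth=1 "file://$work/src" HEAD
git -C "$work/dst" checkout -q FETCH_HEAD

commits=$(git -C "$work/dst" rev-list --count FETCH_HEAD)
files=$(( $(ls "$work/dst" | wc -l) ))

echo "commits fetched: $commits"
echo "files checked out: $files"
```

Only one commit arrives, but the checkout succeeds with all 50 files, so their blobs were transferred even though the last change touched a single file.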
I'll just tell people to put up with the full copy, and try to fix
things so that only one site creates the git repo from p4 in the future.
Thanks for looking,
-- Pete
* Re: refs/replace advice
From: Pete Wyckoff @ 2011-08-02 21:54 UTC (permalink / raw)
To: git; +Cc: Johannes Sixt
pw@padd.com wrote on Fri, 29 Jul 2011 08:31 -0700:
> I've got two near-identical git repos, both imported from
> gigantic upstream p4 repos. They started at slightly different
> times so have different commit SHA1s, even though the tree
> contents are the same. I can't filter-branch either of them; too
> many users already.
>
> I'm trying to use "git replace" to avoid cloning the entire set
> of duplicate commits across a slow inter-site link. Like this:
To follow up, I decided this was not the right approach.
Instead, for future git-p4 repo creation, I've patched git-p4
to make sure the timestamps and hence the initial SHA1s are
identical.
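The git-p4 patch itself is not shown in this thread. The sketch below only illustrates the principle it relies on, with made-up names, identity and dates: if two independent imports feed Git the same content, messages, identity and timestamps for every commit, they arrive at the same commit IDs without ever exchanging objects.

```shell
#!/bin/sh
# Sketch only: two independent "imports" with pinned metadata yield identical IDs.
set -eu
work=$(mktemp -d)

# Hypothetical importer identity, the same on both sites.
export GIT_AUTHOR_NAME=importer GIT_AUTHOR_EMAIL=importer@example.com
export GIT_COMMITTER_NAME=importer GIT_COMMITTER_EMAIL=importer@example.com

# import DIR: replay three changes, dating each from the change, not the wall clock.
import() {
    git init -q "$1"
    for n in 1 2 3; do
        echo "change $n" > "$1/file.txt"
        git -C "$1" add file.txt
        stamp="$((1311953460 + n)) +0000"
        GIT_AUTHOR_DATE="$stamp" GIT_COMMITTER_DATE="$stamp" \
            git -C "$1" commit -q -m "change $n"
    done
}

import "$work/site1"
import "$work/site2"

ids1=$(git -C "$work/site1" rev-list HEAD)
ids2=$(git -C "$work/site2" rev-list HEAD)
count=$(git -C "$work/site1" rev-list --count HEAD)

echo "site1:"; echo "$ids1"
echo "site2:"; echo "$ids2"
```

With identical IDs on both sides, a later fetch between the two sites finds common commits and only needs to transfer what is actually new.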
Thanks, Hannes, for your suggestion, though.
-- Pete