* Joining historical repository using grafts or replace
From: Dmitry Oksenchuk @ 2014-10-30 15:39 UTC (permalink / raw)
To: git
Hello,
We're in the middle of converting a large CVS repository (20 years,
70K commits, 1K branches, 10K tags) to Git and are considering two
separate Git repositories: "historical", with the CVS history, and
"working", created without history from the heads of the active
branches (10 active branches). This gives us a small, fast "working"
repository for developers who don't want the full history locally, and
lets us rewrite history in the "historical" repository (for example, to
add parents to merge commits or to fix conversion mistakes) without
affecting commit hashes in the "working" repository (the hashes may be
stored in the bug tracker or in the code).
The first idea was to use grafts to join branch roots in the "working"
repository with branches in the "historical" repository, as is done for
the linux repository, but it seems that grafts are known as a
"horrible hack" (
http://marc.info/?l=git&m=131127600030310&w=2
http://permalink.gmane.org/gmane.comp.version-control.git/177153 )
Since Git 1.6.5, "replace" can also be used to join the histories by
replacing branch roots in the "working" repository with branch heads in
the "historical" repository.
Both grafts and replace will be used locally. Grafts are a bit easier
to distribute (simple copying, whereas replace refs would have to be
created via a bash script).
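For illustration, here is a minimal sketch of how such a script might join one branch root to the historical history with replace refs. It assumes a Git new enough to have `git replace --graft` (2.1 or later), and it builds two tiny throwaway repositories in a temporary directory; the directory names, commit messages and the "demo" identity are made up for the example and are not from our real setup.

```shell
set -e
# Hypothetical identity so the demo commits work without any user config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)

# "historical": stands in for the converted CVS history (two commits here).
git init -q "$tmp/historical"
git -C "$tmp/historical" commit -q --allow-empty -m "old 1"
git -C "$tmp/historical" commit -q --allow-empty -m "old 2"
hist_head=$(git -C "$tmp/historical" rev-parse HEAD)

# "working": starts without history, from a single root commit.
git init -q "$tmp/working"
git -C "$tmp/working" commit -q --allow-empty -m "new root"
root=$(git -C "$tmp/working" rev-parse HEAD)

# Make the historical objects available locally, then record a replace ref
# that gives the root commit the historical head as its parent.
git -C "$tmp/working" fetch -q "$tmp/historical" HEAD
git -C "$tmp/working" replace --graft "$root" "$hist_head"

# With replace refs honored the history is joined; without them it is not.
joined=$(git -C "$tmp/working" rev-list --count HEAD)
unjoined=$(git -C "$tmp/working" --no-replace-objects rev-list --count HEAD)
# The branch itself still points at the original root hash.
head_after=$(git -C "$tmp/working" rev-parse HEAD)
echo "joined=$joined unjoined=$unjoined"
rm -rf "$tmp"
```

The branch ref keeps pointing at the original root commit, so hashes already recorded elsewhere stay valid; only the replace ref under refs/replace/ changes what history walks see.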
Are there any disadvantages of using grafts and replace? Will both of
them be supported in future versions of Git?
Thank you,
Dmitry
* Re: Joining historical repository using grafts or replace
From: W. Trevor King @ 2014-10-30 15:44 UTC (permalink / raw)
To: Dmitry Oksenchuk; +Cc: git
On Thu, Oct 30, 2014 at 06:39:56PM +0300, Dmitry Oksenchuk wrote:
> We're in the middle of conversion of a large CVS repository (20
> years, 70K commits, 1K branches, 10K tags) to Git and considering
> two separate Git repositories: "historical" with CVS history and
> "working" created without history from heads of active branches (10
> active branches). This allows us to have small fast "working"
> repository for developers who don't want to have full history
> locally and ability to rewrite history in "historical" repository
> (for example, to add parents to merge commits or to fix conversion
> mistakes) without affecting commit hashes in "working" repository
> (the hashes can be stored in bug tracker or in the code).
A number of projects have done something like this (e.g. Linux).
Modern Gits have good support for shallow repositories though, so I'd
just make one full repository and leave it to developers to decide how
deep they want their local copy to be.
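As a rough sketch of what that looks like in practice (the repository below is a throwaway one created just for the demonstration, and the names and the "demo" identity are hypothetical), a developer can pick a depth at clone time and deepen later if they change their mind:

```shell
set -e
# Hypothetical identity so the demo commits work without any user config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)

# A stand-in for the single full repository, with five commits.
git init -q "$tmp/full"
for i in 1 2 3 4 5; do
  git -C "$tmp/full" commit -q --allow-empty -m "commit $i"
done

# --depth is ignored for plain local paths, so use a file:// URL here.
git clone -q --depth 2 "file://$tmp/full" "$tmp/shallow"
shallow_count=$(git -C "$tmp/shallow" rev-list --count HEAD)

# Later, the same clone can be converted into a complete one.
git -C "$tmp/shallow" fetch -q --unshallow
full_count=$(git -C "$tmp/shallow" rev-list --count HEAD)
echo "shallow=$shallow_count full=$full_count"
rm -rf "$tmp"
```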
Cheers,
Trevor
--
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
* Re: Joining historical repository using grafts or replace
From: Dmitry Oksenchuk @ 2014-10-30 17:56 UTC (permalink / raw)
To: W. Trevor King; +Cc: git
2014-10-30 18:44 GMT+03:00 W. Trevor King <wking@tremily.us>:
> On Thu, Oct 30, 2014 at 06:39:56PM +0300, Dmitry Oksenchuk wrote:
>> We're in the middle of conversion of a large CVS repository (20
>> years, 70K commits, 1K branches, 10K tags) to Git and considering
>> two separate Git repositories: "historical" with CVS history and
>> "working" created without history from heads of active branches (10
>> active branches). This allows us to have small fast "working"
>> repository for developers who don't want to have full history
>> locally and ability to rewrite history in "historical" repository
>> (for example, to add parents to merge commits or to fix conversion
>> mistakes) without affecting commit hashes in "working" repository
>> (the hashes can be stored in bug tracker or in the code).
>
> A number of projects have done something like this (e.g. Linux).
> Modern Gits have good support for shallow repositories though, so I'd
> just make one full repository and leave it to developers to decide how
> deep they want their local copy to be.
Good point. Shallow clone allows a developer to have a small fast
repository if history is not needed.
But having new history in one repository with CVS history prevents us
from rewriting it in case of conversion mistakes or desire to restore
parents in merge commits.
Thanks,
Dmitry
* Re: Joining historical repository using grafts or replace
From: Christian Couder @ 2014-10-30 16:54 UTC (permalink / raw)
To: Dmitry Oksenchuk; +Cc: git
Hi,
On Thu, Oct 30, 2014 at 4:39 PM, Dmitry Oksenchuk <oksenchuk89@gmail.com> wrote:
> Hello,
>
> We're in the middle of conversion of a large CVS repository (20 years,
> 70K commits, 1K branches, 10K tags) to Git and considering two
> separate Git repositories: "historical" with CVS history and "working"
> created without history from heads of active branches (10 active
> branches). This allows us to have small fast "working" repository for
> developers who don't want to have full history locally and ability to
> rewrite history in "historical" repository (for example, to add
> parents to merge commits or to fix conversion mistakes) without
> affecting commit hashes in "working" repository (the hashes can be
> stored in bug tracker or in the code).
This might be a good idea. Did you already test that the small
repository is really faster than the full repository?
> The first idea was to use grafts to join branch roots in "working"
> repository with branches in "historical" repository like in linux
> repository but it seems that grafts are known as a "horrible hack" (
> http://marc.info/?l=git&m=131127600030310&w=2
> http://permalink.gmane.org/gmane.comp.version-control.git/177153 )
>
> Since Git 1.6.5 "replace" can also be used to join the histories by
> replacing branch roots in "working" repository with branch heads in
> "historical" repository.
>
> Both grafts and replace will be used locally. Grafts is a bit easier
> to distribute (simple copying, replaces should be created via bash
> script).
First, you might want to have a look at:
http://git-scm.com/book/en/v2/Git-Tools-Replace
as it looks like it describes your use case very well.
> Are there any disadvantages of using grafts and replace? Will both of
> them be supported in future versions of Git?
My opinion is that grafts have no advantage compared to replace refs.
Once you have created your replace refs, they can be managed like
other git refs, so they are easier to distribute.
Basically if you want to get the full history on a computer you just need to do:
git fetch 'refs/replace/*:refs/replace/*'
Best,
Christian.
* Re: Joining historical repository using grafts or replace
From: Dmitry Oksenchuk @ 2014-10-30 17:41 UTC (permalink / raw)
To: Christian Couder; +Cc: git
Hi Christian,
Thanks for your reply.
2014-10-30 19:54 GMT+03:00 Christian Couder <christian.couder@gmail.com>:
> On Thu, Oct 30, 2014 at 4:39 PM, Dmitry Oksenchuk <oksenchuk89@gmail.com> wrote:
>> We're in the middle of conversion of a large CVS repository (20 years,
>> 70K commits, 1K branches, 10K tags) to Git and considering two
>> separate Git repositories: "historical" with CVS history and "working"
>> created without history from heads of active branches (10 active
>> branches). This allows us to have small fast "working" repository for
>> developers who don't want to have full history locally and ability to
>> rewrite history in "historical" repository (for example, to add
>> parents to merge commits or to fix conversion mistakes) without
>> affecting commit hashes in "working" repository (the hashes can be
>> stored in bug tracker or in the code).
>
> This might be a good idea. Did you already test that the small
> repository is really faster than the full repository?
Yes. Because of the large number of refs, a push in the "historical"
repository takes 12 sec, a push in the "working" repository takes
0.4 sec, and a push in the "joined" repository takes 2 sec. Local
history operations like log and blame run at the same speed in the
"joined" repository as in the "historical" repository.
>> Are there any disadvantages of using grafts and replace? Will both of
>> them be supported in future versions of Git?
>
> My opinion is that grafts have no advantage compared to replace refs.
>
> Once you have created your replace refs, they can be managed like
> other git refs, so they are easier to distribute.
>
> Basically if you want to get the full history on a computer you just need to do:
>
> git fetch 'refs/replace/*:refs/replace/*'
That's true but you still need to have another remote with full
history because it has lots of tags and branches that will be cloned
by initial clone.
Regards,
Dmitry
* Re: Joining historical repository using grafts or replace
From: Christian Couder @ 2014-10-31 8:45 UTC (permalink / raw)
To: Dmitry Oksenchuk; +Cc: git
Hi Dmitry,
On Thu, Oct 30, 2014 at 6:41 PM, Dmitry Oksenchuk <oksenchuk89@gmail.com> wrote:
> 2014-10-30 19:54 GMT+03:00 Christian Couder <christian.couder@gmail.com>:
>>
>> This might be a good idea. Did you already test that the small
>> repository is really faster than the full repository?
>
> Yes, because of such amount of refs, push in "historical" repository
> takes 12 sec, push in "working" repository takes 0.4 sec, push in
> "joined" repository takes 2 sec. Local operations with history like
> log and blame work with the same speed in "joined" repository as in
> "historical" repository.
What does "joined" mean? Does it mean joined using grafts? Or joined
using replace refs? Or just the unsplit full repository?
Also what is interesting is if local operations work with the same
speed in the small "working" repository as in the unsplit full
repository.
>>> Are there any disadvantages of using grafts and replace? Will both of
>>> them be supported in future versions of Git?
>>
>> My opinion is that grafts have no advantage compared to replace refs.
>>
>> Once you have created your replace refs, they can be managed like
>> other git refs, so they are easier to distribute.
>>
>> Basically if you want to get the full history on a computer you just need to do:
>>
>> git fetch 'refs/replace/*:refs/replace/*'
By the way the above should be:
git fetch origin 'refs/replace/*:refs/replace/*'
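To show the effect end to end, here is a small self-contained sketch. It assumes a Git with `git replace --graft` (2.1 or later) and uses a throwaway "origin" repository built on the spot; the branch name "work", the commit messages and the "demo" identity are invented for the example. A normal clone does not bring replace refs along, and the explicit refspec above is what pulls them in, together with the objects they point to:

```shell
set -e
# Hypothetical identity so the demo commits work without any user config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)

# Throwaway origin: one "old history" commit, plus an unrelated branch
# "work" whose root is joined to the old commit by a replace ref.
git init -q "$tmp/origin"
git -C "$tmp/origin" commit -q --allow-empty -m "old history"
old=$(git -C "$tmp/origin" rev-parse HEAD)
git -C "$tmp/origin" checkout -q --orphan work
git -C "$tmp/origin" commit -q --allow-empty -m "new root"
root=$(git -C "$tmp/origin" rev-parse HEAD)
git -C "$tmp/origin" replace --graft "$root" "$old"

# Cloning only the working branch gives the short history.
git clone -q --single-branch --branch work "file://$tmp/origin" "$tmp/clone"
before=$(git -C "$tmp/clone" rev-list --count HEAD)

# Fetching the replace refs makes the joined history visible.
git -C "$tmp/clone" fetch -q origin 'refs/replace/*:refs/replace/*'
after=$(git -C "$tmp/clone" rev-list --count HEAD)
echo "before=$before after=$after"
rm -rf "$tmp"
```

Someone who always wants the joined view could instead add the refspec to the remote's fetch configuration, for example with `git config --add remote.origin.fetch '+refs/replace/*:refs/replace/*'`, so that a plain `git fetch` keeps the replace refs up to date.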
> That's true but you still need to have another remote with full
> history because it has lots of tags and branches that will be cloned
> by initial clone.
Yeah, you might want to have another remote for that reason, but this
is true with both grafts and replace refs.
Best,
Christian.
* Re: Joining historical repository using grafts or replace
From: Dmitry Oksenchuk @ 2014-10-31 15:47 UTC (permalink / raw)
To: Christian Couder; +Cc: git
Hi Christian,
> On Thu, Oct 30, 2014 at 6:41 PM, Dmitry Oksenchuk <oksenchuk89@gmail.com> wrote:
>> 2014-10-30 19:54 GMT+03:00 Christian Couder <christian.couder@gmail.com>:
>>>
>>> This might be a good idea. Did you already test that the small
>>> repository is really faster than the full repository?
>>
>> Yes, because of such amount of refs, push in "historical" repository
>> takes 12 sec, push in "working" repository takes 0.4 sec, push in
>> "joined" repository takes 2 sec. Local operations with history like
>> log and blame work with the same speed in "joined" repository as in
>> "historical" repository.
>
> What does "joined" mean? Does it mean joined using grafts? Or joined
> using replace refs? Or just the unsplit full repository?
It's joined using grafts or replace. In both cases performance is the same.
> Also what is interesting is if local operations work with the same
> speed in the small "working" repository as in the unsplit full
> repository.
Speed of operations like git diff, git add, git commit is exactly the
same in both repositories.
Operations like git log and git blame work much faster in repository
without history (not surprisingly :)
For example, git log in small repository takes 0.2 sec, in full
repository - 0.8 sec. git blame in full repository can take up to 9
sec for large files with long history.
Regards,
Dmitry
* Re: Joining historical repository using grafts or replace
From: Christian Couder @ 2014-11-01 15:03 UTC (permalink / raw)
To: Dmitry Oksenchuk; +Cc: git
Hi Dmitry,
On Fri, Oct 31, 2014 at 4:47 PM, Dmitry Oksenchuk <oksenchuk89@gmail.com> wrote:
> Hi Christian,
>
>> On Thu, Oct 30, 2014 at 6:41 PM, Dmitry Oksenchuk <oksenchuk89@gmail.com> wrote:
>>>
>>> Yes, because of such amount of refs, push in "historical" repository
>>> takes 12 sec, push in "working" repository takes 0.4 sec, push in
>>> "joined" repository takes 2 sec. Local operations with history like
>>> log and blame work with the same speed in "joined" repository as in
>>> "historical" repository.
>>
>> What does "joined" mean? Does it mean joined using grafts? Or joined
>> using replace refs? Or just the unsplit full repository?
>
> It's joined using grafts or replace. In both cases performance is the same.
>
>> Also what is interesting is if local operations work with the same
>> speed in the small "working" repository as in the unsplit full
>> repository.
>
> Speed of operations like git diff, git add, git commit is exactly the
> same in both repositories.
> Operations like git log and git blame work much faster in repository
> without history (not surprisingly :)
> For example, git log in small repository takes 0.2 sec, in full
> repository - 0.8 sec. git blame in full repository can take up to 9
> sec for large files with long history.
Ok, thanks for the information. I think it shows that indeed it makes
sense to split your repo.
Best,
Christian.