From: Michael Haggerty <mhagger@alum.mit.edu>
To: Steffen Prohaska <prohaska@zib.de>
Cc: git@vger.kernel.org, users@cvs2svn.tigris.org
Subject: Re: cvs2svn conversion directly to git ready for experimentation
Date: Thu, 02 Aug 2007 19:23:57 +0200
Message-ID: <46B2132D.7090304@alum.mit.edu>
In-Reply-To: <65F1862F-4DF2-4A52-9FD5-20802AEACDAB@zib.de>

Steffen Prohaska wrote:
> On Aug 1, 2007, at 2:09 AM, Michael Haggerty wrote:
>> I am looking forward to your feedback. Even better would be if somebody
>> wants to join forces on this project. I would be happy to supply the
>> cvs2svn knowledge if you can bring the git experience.
>
> I tried it with revision trunk@3930 of cvs2svn. The results are as follows.
Thanks for the feedback!
> cvs2svn created a lot of branches that are not present in CVS,
> with names identical to CVS tags. Apparently these branches are
> used to create a commit matching a certain CVS tag.
That is correct. This is something that I plan to work on, at least for
tags that can be created from a single source commit.
> The branching structure looks, ... hmm ..., interesting. cvs2svn
> manufactured commits to get the branching points right.
> Apparently our CVS has some weird commits like 'unlabeled-1.1.1'
> and two other named tags (maybe vendor branches?) that cause
> these manufactured commits. In gitk I see long lines running
> parallel to the cvs trunk all down to these weird CVS tags. They
> are not very useful, although they might be correct. Note,
> parsecvs imports our repository without such basically useless
> links. However, I can't verify if parsecvs gets something wrong.
Branches with names like "unlabeled-1.1.1" come from CVS branches for
which the revisions are still contained in the RCS files but for which
the branch name has been deleted. These wreak havoc on cvs2svn's
attempt to find simple branch sources and cause a proliferation of
basically useless branches. The main problem is that cvs2svn does not
attempt to figure out that "unlabeled-1.2.4" in one file might be the
same as "unlabeled-1.2.6" in another etc.
An "unlabeled-1.1.1", in particular, means that the branch whose name
was deleted was a vendor branch. The deletion of a vendor branch name
can cause even more mayhem.
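
To make the naming concrete, here is a rough sketch of how such a name relates to the RCS revision numbers. A revision number with four or more components (an even count) lives on a branch, and dropping the last component gives the branch number. The helper names below are made up for illustration and are not cvs2svn functions.

```python
def branch_number(revision):
    """Return the RCS branch number a revision lives on, or None for trunk.

    '1.7' is a trunk revision; '1.2.4.1' is the first revision on branch 1.2.4.
    """
    parts = revision.split('.')
    if len(parts) < 4 or len(parts) % 2 != 0:
        return None
    return '.'.join(parts[:-1])


def unlabeled_name(revision):
    """Name used for the branch when no symbol points at it any more."""
    branch = branch_number(revision)
    return None if branch is None else 'unlabeled-' + branch


def is_default_vendor_branch(branch):
    """1.1.1 is the branch number that 'cvs import' uses by default."""
    return branch == '1.1.1'


print(unlabeled_name('1.2.4.1'))
print(is_default_vendor_branch(branch_number('1.1.1.3')))
```

Because the branch number is assigned per file, the same logical branch can end up as 1.2.4 in one file and 1.2.6 in another, which is why the names do not line up across files.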
In most cases it makes sense to exclude the unlabeled branches. After
all, somebody tried to delete them, so they can't be that important,
right? Use --exclude='unlabeled-.*', or add a line like this to your
options file:

    ctx.symbol_strategy.add_rule(ExcludeRegexpStrategyRule(r'unlabeled-.*'))

This can of course cause problems if other branches or tags were
created that branched off of the unlabeled branch. In such cases the
dependent branches/tags might have to be excluded too.
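
If it helps to see which symbols would be affected before rerunning the conversion, something along these lines could be used as a preview. It is only a sketch under two assumptions: that the pattern has to match the whole symbol name, and that you already have a mapping from each symbol to the symbol it sprouts from. The `parents` table and the function name are hypothetical; cvs2svn works out the real dependencies itself, and a symbol can have different parents in different files.

```python
import re

EXCLUDE = re.compile(r'unlabeled-.*')


def symbols_to_exclude(parents, pattern=EXCLUDE):
    """Return the matching symbols plus everything that depends on them.

    parents maps a symbol name to the symbol it branches from
    (None when it branches from trunk).
    """
    excluded = {name for name in parents if pattern.fullmatch(name)}
    changed = True
    while changed:  # keep going until no new dependents turn up
        changed = False
        for name, parent in parents.items():
            if parent in excluded and name not in excluded:
                excluded.add(name)
                changed = True
    return excluded


# Made-up example data, not taken from any real repository.
parents = {
    'RELEASE_1_0': None,
    'unlabeled-1.2.4': None,
    'bugfix-branch': 'unlabeled-1.2.4',
    'BUGFIX_TAG': 'bugfix-branch',
}
print(sorted(symbols_to_exclude(parents)))
```

Here 'bugfix-branch' and 'BUGFIX_TAG' are pulled in only because they hang off the unlabeled branch; 'RELEASE_1_0' is left alone.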
> Other branches are created over a couple of commits mixing in
> several branches (maybe again our weird commits already
> mentioned). See branching1.png, branching2.png, branching3.png.
> [ I have to apologize, our cvs repository contains proprietary
> information, so I can't publish its history freely. ]
This can definitely be caused by unlabeled branches. It can also be
caused by branches rooted in a vendor branch. In many cases, such
branches can actually be grafted onto trunk, but cvs2svn does not (yet)
attempt this.
> cvs2svn is the first tool besides parsecvs that worked for me,
> that is, imported the whole repository, passed the basic test of
> matching checkouts from cvs and git, and got the one suspicious
> commit right that I'm using for verifying the branching points.
>
> [ I have no time to go into the details of all these tests.
> Therefore only a very short summary:
> All tools needed basic cleanup of a few corrupted ,v files and
> ,v files that were duplicated in Attic.
> git-cvsimport fails to create branches at the right commit.
> fromcvs's togit surrendered during the import.
> fromcvs's tohg accepted more of the history, but finally
> surrendered as well.
> parsecvs works for me (crashes on corrupted ,v files).
> cvs2svn followed by git-svnimport creates the wrong state at the
> tips of branches.
> cvs2svn direct git import works for me (reports corrupted ,v files).
> ]
Thanks very much for this interesting summary.
> Right now, I'd prefer the import by parsecvs because of the
> simpler history. However, I don't know if I lose history
> information by doing so. I'd start with a run of cvs2svn to validate
> the overall structure of the CVS repository. Dealing with corruption
> in the CVS repository seems to be superior in cvs2svn. It reports
> errors when parsecvs just crashes.
If excluding the unlabeled branches does not fix things for you, I
suggest checking out the first revision on such a branch, and comparing
the results from CVS, from parsecvs, and from cvs2svn. It *should* be
that the version of the file from the vendor branch is included in the
working copy. cvs2svn should handle this correctly. I am curious
whether parsecvs does.
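
For the comparison itself, a small script like the following might save some manual diffing. It is a minimal sketch that assumes you have already checked out the same revision into separate directories (one per tool); the function names are mine, and the list of skipped metadata directories is a guess at what you will want to ignore.

```python
import hashlib
import os

# Bookkeeping directories that differ between tools by design.
SKIP_DIRS = {'CVS', '.git', '.svn'}


def tree_digest(root):
    """Map each file path (relative to root) to the SHA-1 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, 'rb') as handle:
                digest = hashlib.sha1(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def compare_trees(left, right):
    """Return (only in left, only in right, present in both but differing)."""
    a, b = tree_digest(left), tree_digest(right)
    only_left = sorted(set(a) - set(b))
    only_right = sorted(set(b) - set(a))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_left, only_right, differing
```

Calling `compare_trees('checkout-cvs', 'checkout-cvs2svn')` (directory names are placeholders) gives three lists to look at. Files that differ only in expanded CVS keywords such as $Id$ will show up as differing, so those need a second look by hand.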
Michael