git.vger.kernel.org archive mirror
* [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
@ 2008-04-02  1:34 David Mansfield
  2008-04-02 19:29 ` Junio C Hamano
  2008-04-03  5:47 ` Steffen Prohaska
  0 siblings, 2 replies; 18+ messages in thread
From: David Mansfield @ 2008-04-02  1:34 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1440 bytes --]

Hi Everyone,

This email addresses a long-standing bug in cvsimport which is due
to a bug in cvsps.  The bug is that branches can be forked off too
late.

In case you're wondering, I'm actually the original author of cvsps,
which is behind the scenes for cvsimport.  I don't call myself
maintainer because I've hardly been that over the last few years.

Anyway, the fix to cvsps is attached (1st 2 patches) as well as the
patch to git-cvsimport.perl (2nd 2 patches) against the master branch as
of today's git repo.

The cvsps patches apply with fuzz against the 2.1 version which is out
there.

The full tarball of the latest cvsps version including this is available
on the website http://www.cobite.com/cvsps as well, the version is
2.2b1.

I plan to find time in the next week or so to merge all of the
outstanding patches from Yann Dirson's git repo, publish cvsps via a git
repo myself, and fix other bugs as time permits (including adding
support for multiple tags).

I'd mainly like feedback if anyone can test this.

Also, as I'm actually a newb. to this list, if I'm violating any rules,
such as how to post the patches, let me know.

Thanks,
David

P.S. Also, as many people may have imported broken branches already, can
anyone think of a way to fix the branch (maybe with git-rebase or
something)?  The breakage affects, I believe, files not ever modified on
the branch until any given point in time on the branch...


[-- Attachment #2: 01-cvsps-add-branch-object.patch --]
[-- Type: application/mbox, Size: 3429 bytes --]

[-- Attachment #3: 02-cvsps-implement-branch-point-detection.patch --]
[-- Type: application/mbox, Size: 3100 bytes --]

[-- Attachment #4: 03-cvsimport-parse-new-cvsps-output.patch --]
[-- Type: application/mbox, Size: 1252 bytes --]

[-- Attachment #5: 04-cvsimport-redo-branch-creation-process.patch --]
[-- Type: application/mbox, Size: 4260 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-02  1:34 [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports David Mansfield
@ 2008-04-02 19:29 ` Junio C Hamano
  2008-04-03  1:44   ` David Mansfield
  2008-04-03  5:47 ` Steffen Prohaska
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2008-04-02 19:29 UTC (permalink / raw)
  To: David Mansfield; +Cc: git

David Mansfield <david@cobite.com> writes:

> In case you're wondering, I'm actually the original author of cvsps,
> which is behind the scenes for cvsimport.  I don't call myself
> maintainer because I've hardly been that over the last few years.
>
> Anyway, the fix to cvsps is attached (1st 2 patches) as well as the
> patch to git-cvsimport.perl (2nd 2 patches) against the master branch as
> of today's git repo.
>
> The cvsps patches apply with fuzz against the 2.1 version which is out
> there.

When output from an unfixed cvsps is fed to the updated cvsimport, does it
gracefully do the wrong thing (iow, create the same broken history not too
much worse than the original)?

> @@ -826,12 +824,9 @@ while (<CVS>) {
>  		$branch = $_;
>  		$state = 5;
>  	} elsif ($state == 5 and s/^Ancestor branch:\s+//) {
> -		s/\s+$//;
> -		$ancestor = $_;
> -		$ancestor = $opt_o if $ancestor eq "HEAD";
> +		# now ignored.  see 'Branches' below
>  		$state = 6;
>  	} elsif ($state == 5) {
> -		$ancestor = undef;
>  		$state = 6;
>  		redo;
>  	} elsif ($state == 6 and s/^Tag:\s+//) {

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-02 19:29 ` Junio C Hamano
@ 2008-04-03  1:44   ` David Mansfield
  2008-04-03  2:06     ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: David Mansfield @ 2008-04-03  1:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 2008-04-02 at 12:29 -0700, Junio C Hamano wrote:
> David Mansfield <david@cobite.com> writes:
> 
> > In case you're wondering, I'm actually the original author of cvsps,
> > which is behind the scenes for cvsimport.  I don't call myself
> > maintainer because I've hardly been that over the last few years.
> >
> > Anyway, the fix to cvsps is attached (1st 2 patches) as well as the
> > patch to git-cvsimport.perl (2nd 2 patches) against the master branch as
> > of today's git repo.
> >
> > The cvsps patches apply with fuzz against the 2.1 version which is out
> > there.
> 
> When output from an unfixed cvsps is fed to the updated cvsimport, does it
> gracefully do the wrong thing (iow, create the same broken history not too
> much worse than the original)?
> 
> > @@ -826,12 +824,9 @@ while (<CVS>) {
> >  		$branch = $_;
> >  		$state = 5;
> >  	} elsif ($state == 5 and s/^Ancestor branch:\s+//) {
> > -		s/\s+$//;
> > -		$ancestor = $_;
> > -		$ancestor = $opt_o if $ancestor eq "HEAD";
> > +		# now ignored.  see 'Branches' below
> >  		$state = 6;
> >  	} elsif ($state == 5) {
> > -		$ancestor = undef;
> >  		$state = 6;
> >  		redo;
> >  	} elsif ($state == 6 and s/^Tag:\s+//) {
> 

Not currently.  I'm just searching for failure modes for the feature at
the moment (I've already found one myself). 

You're right to point this out though.  Maybe someone can help me write
some tests for this?

Also, how does the git packaging (non-rpm version) specify and/or
guarantee dependencies are at a certain version anyway?

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-03  1:44   ` David Mansfield
@ 2008-04-03  2:06     ` Junio C Hamano
  2008-04-03  2:27       ` David Mansfield
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2008-04-03  2:06 UTC (permalink / raw)
  To: David Mansfield; +Cc: git

David Mansfield <david@cobite.com> writes:

> Also, how does the git packaging (non-rpm version) specify and/or
> guarantee dependencies are at a certain version anyway?

We cannot really do much with the old cvsimport out in the field, but I
was wondering more about automatic detection in new cvsimport.

The way I read 02-cvsps-implement-branch-point-detection.patch, you have
three cases:

 - "Ancestor branch:" is not followed by "Branches:" before "Log:"
   (old cvsps);

 - "Ancestor branch:" is followed by "Branches:" before "Log:" (new);

 - "Branches:" without "Ancestor branch:" (new);

So perhaps your 04-cvsimport-redo-branch-creation-process.patch, instead
of ignoring what "Ancestor branch:" said, can remember it has seen what
"ancestor" (which may be a bit off) information it was given, and when you
see "Log:" (by that time, you either have seen "Branches:" from new cvsps,
or you didn't see it from old cvsps) you can decide which vintage of cvsps
it is reading from.

Or something like that.
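
A rough sketch of that decision, written in Python rather than the Perl
of git-cvsimport.  Only the three header prefixes come from the
discussion above; the function name, the return values and the idea of
handing it one PatchSet header at a time are assumptions made for
illustration.

```python
def classify_patchset(header_lines):
    """Guess which cvsps vintage produced one PatchSet header.

    Returns (vintage, ancestor), where vintage is 'new', 'old' or
    'unknown' and ancestor is the text after "Ancestor branch:" (or None).
    """
    ancestor = None
    saw_branches = False
    for line in header_lines:
        if line.startswith("Ancestor branch:"):
            # Remember the (possibly slightly off) ancestor instead of
            # ignoring it.
            ancestor = line.split(":", 1)[1].strip()
        elif line.startswith("Branches:"):
            saw_branches = True
        elif line.startswith("Log:"):
            # The header is over; anything after this is log text.
            break
    if saw_branches:
        return "new", ancestor
    if ancestor is not None:
        return "old", ancestor
    # Neither line was present, so this header alone tells us nothing.
    return "unknown", ancestor
```

With an old cvsps the remembered ancestor is all there is to go on; with
a new one it can be dropped in favour of whatever "Branches:" says.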

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-03  2:06     ` Junio C Hamano
@ 2008-04-03  2:27       ` David Mansfield
  0 siblings, 0 replies; 18+ messages in thread
From: David Mansfield @ 2008-04-03  2:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


On Wed, 2008-04-02 at 19:06 -0700, Junio C Hamano wrote:
> David Mansfield <david@cobite.com> writes:
> 
> > Also, how does the git packaging (non-rpm version) specify and/or
> > guarantee dependencies are at a certain version anyway?
> 
> We cannot really do much with the old cvsimport out in the field, but I
> was wondering more about automatic detection in new cvsimport.
> 
> The way I read 02-cvsps-implement-branch-point-detection.patch, you have
> three cases:
> 
>  - "Ancestor branch:" is not followed by "Branches:" before "Log:"
>    (old cvsps);
> 
>  - "Ancestor branch:" is followed by "Branches:" before "Log:" (new);
> 
>  - "Branches:" without "Ancestor branch:" (new);
> 
> So perhaps your 04-cvsimport-redo-branch-creation-process.patch, instead
> of ignoring what "Ancestor branch:" said, can remember it has seen what
> "ancestor" (which may be a bit off) information it was given, and when you
> see "Log:" (by that time, you either have seen "Branches:" from new cvsps,
> or you didn't see it from old cvsps) you can decide which vintage of cvsps
> it is reading from.
> 
> Or something like that.

Quite right.  And also, one of the failure modes I've found is based on
real abuse of cvs, and the result is that cvsps shows the branch as
occurring AFTER the first commit on that branch.  Anyway, it's all
nonsense and hand-waving after all.  cvsps just creates an illusion
anyway.

But to fix it, I'll need something similar to the code that I removed
anyway, so that will definitely be in the mix when all is said and done.

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-02  1:34 [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports David Mansfield
  2008-04-02 19:29 ` Junio C Hamano
@ 2008-04-03  5:47 ` Steffen Prohaska
  2008-04-03 13:49   ` David Mansfield
  1 sibling, 1 reply; 18+ messages in thread
From: Steffen Prohaska @ 2008-04-03  5:47 UTC (permalink / raw)
  To: David Mansfield; +Cc: git


On Apr 2, 2008, at 3:34 AM, David Mansfield wrote:

> P.S Also, as many people may have imported broken branches already,  
> can
> anyone thing of a way to fix the branch, (maybe with git-rebase or
> something)?  The breakage affects, I believe, files not ever  
> modified on
> the branch until any given point in time on the branch...
>

The breakage you describe might be the same breakage that I recognized
in June 2007:

   http://article.gmane.org/gmane.comp.version-control.git/50736

At that time, I wrote a script (git-transplant) that fixed a broken
import from CVS for me:

   http://article.gmane.org/gmane.comp.version-control.git/50746

The discussion in

   http://article.gmane.org/gmane.comp.version-control.git/50789

explains the reason for the script in a bit more detail.

But note that I never finished git-transplant and I also failed to
convince anyone that the idea behind the script is of any general value.
Instead, I decided that git-cvsimport is not the right tool for me; and
since then I use parsecvs to convert my repositories.

         Steffen

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-03  5:47 ` Steffen Prohaska
@ 2008-04-03 13:49   ` David Mansfield
  2008-04-04  9:52     ` Michael Haggerty
  0 siblings, 1 reply; 18+ messages in thread
From: David Mansfield @ 2008-04-03 13:49 UTC (permalink / raw)
  To: Steffen Prohaska; +Cc: git


On Thu, 2008-04-03 at 07:47 +0200, Steffen Prohaska wrote:
> On Apr 2, 2008, at 3:34 AM, David Mansfield wrote:
> 
> > P.S Also, as many people may have imported broken branches already,  
> > can
> > anyone thing of a way to fix the branch, (maybe with git-rebase or
> > something)?  The breakage affects, I believe, files not ever  
> > modified on
> > the branch until any given point in time on the branch...
> >
> 
> The breakage you describe might be the same breakage that I recognized
> in June 2007:
> 
>    http://article.gmane.org/gmane.comp.version-control.git/50736
> 
> At that time, I wrote a script (git-transplant) that fixed a broken
> import from CVS for me:
> 
>    http://article.gmane.org/gmane.comp.version-control.git/50746
> 
> The discussion in
> 
>    http://article.gmane.org/gmane.comp.version-control.git/50789
> 
> explains the reason for the script a bit more detailed.
> 
> But note that I never finished git-transplant and I also failed to
> convince anyone that the idea behind the script is of any general value.
> Instead, I decided that git-cvsimport is not the right tools for me; and
> since then I use parsecvs to convert my repositories.
> 


Yes.  It's the same problem.  It will be fixed with the above patches
once they stabilize.  I'll look at the transplant thing too.  It looks
like a good idea.

The main issue with git-cvsimport stems from an unfixable problem.
cvsps's design goal is to show commits in chronological order.  Based
solely on this data, it's impossible to always reconstruct a branch
point (or a tag) since a person could have committed files after someone
else's commit, but not done an update then tagged.  

So some files are from before the 'other' user's commit, and some files
after.  What can you do?  

It's not per se a flaw in cvsps, it always wanted to show commits in
chronological order, but it is a severe limitation in using cvsps to
generate changesets for git.

By engineering a direct tool (such as parsecvs, I presume) these
obstacles can be overcome by constructing some commits that were never
made by the actual users of the cvs repo in order to get it right.

I'm not sure exactly how this is done, because I've never looked at
parsecvs.

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-03 13:49   ` David Mansfield
@ 2008-04-04  9:52     ` Michael Haggerty
  2008-04-07 17:54       ` David Mansfield
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Haggerty @ 2008-04-04  9:52 UTC (permalink / raw)
  To: David Mansfield; +Cc: Steffen Prohaska, git

David Mansfield wrote:
> The main issue with git-cvsimport stems from an unfixable problem.
> cvsps's design goal is to show commits in chronological order.  Based
> solely on this data, it's impossible to always reconstruct a branch
> point (or a tag) since a person could have committed files after someone
> else's commit, but not done an update then tagged.  

Just to be more explicit, I think you are talking about a situation like
this:

1. Add file1:1.1 and file2:1.1 to repository.
2. User1 modifies file1 and commits file1:1.2.
   ...some non-negligible amount of time passes...
3. User2 modifies file2 and commits file2:1.2.
4. User2, without updating file1 to revision 1.2, adds a tag.

This results in a tag that refers to file1:1.1 and file2:1.2, even
though those two revisions never appeared in the repository at the same
time.
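
To make that concrete, here is a small Python sketch that replays the
four steps and checks whether the tag matches any state the repository
was ever in.  The data layout (a list of per-commit dicts) and the
function names are invented for this example; they are not how cvsps or
any importer represents history.

```python
def snapshots(commits):
    """Yield the repository state after each commit, oldest first.

    `commits` is a list of dicts mapping file name -> newly committed
    revision.
    """
    state = {}
    for changes in commits:
        state.update(changes)
        yield dict(state)


def tag_matches_some_snapshot(commits, tag):
    """True if the tagged revisions ever coexisted as one snapshot."""
    return any(snap == tag for snap in snapshots(commits))


history = [
    {"file1": "1.1", "file2": "1.1"},  # step 1: both files added
    {"file1": "1.2"},                  # step 2: User1 commits file1
    {"file2": "1.2"},                  # step 3: User2 commits file2
]
tag = {"file1": "1.1", "file2": "1.2"}  # step 4: User2's stale checkout

print(tag_matches_some_snapshot(history, tag))  # prints False
```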

> So some files are from before the 'other' user's commit, and some files
> after.  What can you do?  

You can do the only thing that is consistent with the CVS
history--create the tag not from a single source revision but from
multiple revisions.  Unfortunately, git cannot handle this directly, but
there is a workaround using a "fixup branch" [1].

cvs2svn/cvs2git [2] creates a "fixup branch", copies file1:1.1 and
file2:1.2 onto that branch, then creates the tag from the fixup branch.
 This ensures that checking the tag out of git gives the same file
contents as checking the tag out of CVS.  I think that git-cvsimport
gets this wrong (!?!)
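
As a simplified illustration of the fixup idea (not cvs2git's actual
algorithm, which is described in the design notes), one can pick a
source snapshot and compute which files have to be overwritten or
deleted on the fixup branch so that its tree equals the tag contents.
The function name and the dict representation are assumptions of this
Python sketch.

```python
def fixup_changes(source_snapshot, tag_contents):
    """Changes to apply on a fixup branch forked from source_snapshot.

    Both arguments map file name -> revision.  The result maps file
    name -> revision to copy in, or None if the file must be deleted
    because the tag does not include it.
    """
    changes = {}
    for path, rev in tag_contents.items():
        if source_snapshot.get(path) != rev:
            changes[path] = rev
    for path in source_snapshot:
        if path not in tag_contents:
            changes[path] = None
    return changes


# Fork from the newest trunk state and repair file1 to match the tag.
trunk = {"file1": "1.2", "file2": "1.2"}
tag = {"file1": "1.1", "file2": "1.2"}
print(fixup_changes(trunk, tag))  # prints {'file1': '1.1'}
```

The tag is then created from the single fixup commit, so a checkout of
the tag yields file1:1.1 and file2:1.2.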

It is your framing of the problem that is leading to the impossibility.
 CVS's design does *not* require that a tag or branch is created in a
single commit, nor that it is created from a single source revision.
Trying to impose these artificial constraints means that the resulting
git repository is inconsistent with the CVS repository in quite common
circumstances.

> It's not per se a flaw in cvsps, it always wanted to show commits in
> chronological order, but it is a severe limitation in using cvsps to
> generate changesets for git.

cvs2git always creates commits in chronological order too, but its
output is by design always consistent with the CVS record.

> By engineering a direct tool (such as parsecvs, I presume) these
> obstacles can be overcome by constructing some commits that were never
> made by the actual users of the cvs repo in order to get it right.
> 
> I'm not sure exactly how this is done, because I've never looked at
> parsecvs.

cvs2git's design is documented quite extensively, if you are interested
[3].  Parsecvs, AFAIK, uses a similar approach.

Michael

[1] http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html
[2] http://cvs2svn.tigris.org/cvs2git.html
[3] http://cvs2svn.tigris.org/svn/cvs2svn/trunk/doc/design-notes.txt

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-04  9:52     ` Michael Haggerty
@ 2008-04-07 17:54       ` David Mansfield
  2008-04-07 18:07         ` Jean-François Veillette
  2008-04-09  1:53         ` Michael Haggerty
  0 siblings, 2 replies; 18+ messages in thread
From: David Mansfield @ 2008-04-07 17:54 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Steffen Prohaska, git


On Fri, 2008-04-04 at 11:52 +0200, Michael Haggerty wrote:
> David Mansfield wrote:
> > The main issue with git-cvsimport stems from an unfixable problem.
> > cvsps's design goal is to show commits in chronological order.  Based
> > solely on this data, it's impossible to always reconstruct a branch
> > point (or a tag) since a person could have committed files after someone
> > else's commit, but not done an update then tagged.  
> 
> Just to be more explicit, I think you are talking about a situation like
> this:
> 
> 1. Add file1:1.1 and file2:1.1 to repository.
> 2. User1 modifies file1 and commits file1:1.2.
>    ...some non-negligible amount of time passes...
> 3. User2 modifies file2 and commits file2:1.2.
> 4. User2, without updating file1 to revision 1.2, adds a tag.
> 
> This results in a tag that refers to file1:1.1 and file2:1.2, even
> though those two revisions never appeared in the repository at the same
> time.
> 

More or less, yes.  It gets worse if a user does 'cvs update' in a
directory, or on an individual file.


> > So some files are from before the 'other' user's commit, and some files
> > after.  What can you do?  
> 
> You can do the only thing that is consistent with the CVS
> history--create the tag not from a single source revision but from
> multiple revisions.  Unfortunately, git cannot handle this directly, but
> there is a workaround using a "fixup branch" [1].
> 
> cvs2svn/cvs2git [2] creates a "fixup branch", copies file1:1.1 and
> file2:1.2 onto that branch, then creates the tag from the fixup branch.
>  This ensures that checking the tag out of git gives the same file
> contents as checking the tag out of CVS.  I think that git-cvsimport
> gets this wrong (!?!)
> 
> It is your framing of the problem that is leading to the impossibility.
>  CVS's design does *not* require that a tag or branch is created in a
> single commit, nor that it is created from a single source revision.
> Trying to impose these artificial constraints means that the resulting
> git repository is inconsistent with the CVS repository in quite common
> circumstances.
> 

It's not 'my framing of the problem.'  It's 'the design goal of cvsps is
not compatible with the desire to use the output of cvsps to create a
git repository.'  See the difference?

> > It's not per se a flaw in cvsps, it always wanted to show commits in
> > chronological order, but it is a severe limitation in using cvsps to
> > generate changesets for git.
> 
> cvs2git always creates commits in chronological order too, but its
> output is by design always consistent with the CVS record.
> 

Yes.  That's what cvs2git was designed for.  Look at the name.  In order
to create the 'fixup' branch, you have to make some out-of-order operations,
which is fine if that's what your design goal is.

The design goal of cvsps was always simply to show who did what and in
what chronological order.  However, just with that, it's impossible to
use for the purpose it is currently being used for.

The 'fixup branch' sounds like a really great idea and an elegant
solution.

> > By engineering a direct tool (such as parsecvs, I presume) these
> > obstacles can be overcome by constructing some commits that were never
> > made by the actual users of the cvs repo in order to get it right.
> > 
> > I'm not sure exactly how this is done, because I've never looked at
> > parsecvs.
> 
> cvs2git's design is documented quite extensively, if you are interested
> [3].  Parsecvs, AFAIK, uses a similar approach.
> 

I'm quite happy that there are other tools, and even more happy if they
already fix every bug that git-cvsimport has.

I was simply addressing the bug from the standpoint of: this issue can
be fixed without compromising what cvsps wants to be as a tool.

The place where the fixup branch logic needs to be is in git-cvsimport,
not in cvsps.  Better yet, get rid of git-cvsimport and replace it with
cvs2git if it works better.  

However, I'd like to fix problems with the
cvsps/git-cvsimport combination if possible, unless someone can tell me
for sure that it's obsolete and no one uses it.

Thanks,
David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-07 17:54       ` David Mansfield
@ 2008-04-07 18:07         ` Jean-François Veillette
  2008-04-09  1:53         ` Michael Haggerty
  1 sibling, 0 replies; 18+ messages in thread
From: Jean-François Veillette @ 2008-04-07 18:07 UTC (permalink / raw)
  To: David Mansfield; +Cc: Michael Haggerty, Steffen Prohaska, Git


> However, if possible, I'd like to fix problems with the cvsps/git- 
> cvsimport if possible, unless someone can tell me for sure that  
> it's obsolete and noone uses it.

I do use it, please fix the duo cvsps/git-cvsimport (if possible).
The fact that it's integrated with git make it very useful and handy.
It is working well for almost all of my cvs repo that I track with git.
If it would work for all of them it would be wonderful !

- jfv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-07 17:54       ` David Mansfield
  2008-04-07 18:07         ` Jean-François Veillette
@ 2008-04-09  1:53         ` Michael Haggerty
  2008-04-27  5:06           ` Ping Yin
  1 sibling, 1 reply; 18+ messages in thread
From: Michael Haggerty @ 2008-04-09  1:53 UTC (permalink / raw)
  To: David Mansfield; +Cc: Steffen Prohaska, git

David Mansfield wrote:
> The design goal of cvsps was always simply to show who did what and in
> what chronological order.  However, just with that, it's impossible to
> use for the purpose it is currently being used for.

Good point.

I re-read the cvsps manpage and found the information about "FUNKY" and
"INVALID" tags.  I'd forgotten that cvsps does the right thing in some
cases by warning the user about tags that are beyond its abilities to
describe.  (But there are other problems that cvsps doesn't warn about;
see below.)

Then I looked in the git-cvsimport code to see how it deals with FUNKY
and INVALID tags.  It does the *wrong* thing by explicitly ignoring
these warnings (!).  IMHO git-cvsimport should notice the **FUNKY** and
**INVALID** annotations and at least output a warning to the user that
the associated tags may not have been converted correctly.
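
Something along these lines would be enough to surface the warnings.
This is a Python sketch rather than a patch to the Perl script, and it
assumes that the annotation follows the tag name on the "Tag:" line; the
function name is invented, and a real parser would also track its state
so that log text beginning with "Tag:" is not misread.

```python
import re

# Assumed layout: "Tag: <name>" optionally followed by **FUNKY** or
# **INVALID**.
TAG_LINE = re.compile(
    r"^Tag:\s+(?P<name>\S+)(?:\s+\*\*(?P<flag>FUNKY|INVALID)\*\*)?\s*$"
)


def suspicious_tags(cvsps_output_lines):
    """Return (tag name, flag) pairs for tags that cvsps marked as dubious."""
    found = []
    for line in cvsps_output_lines:
        match = TAG_LINE.match(line)
        if match and match.group("flag"):
            found.append((match.group("name"), match.group("flag")))
    return found


sample = [
    "Tag: (none)",
    "Tag: rel_1 **FUNKY**",
    "Tag: nightly **INVALID**",
    "Tag: good_tag ",
]
for name, flag in suspicious_tags(sample):
    print(f"warning: tag {name} is marked {flag}; it may not convert correctly")
```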

But cvsps makes some other symbol-related mistakes, presumably in the
name of simplification.  These problems make it impossible for
git-cvsimport to generate accurate branches and tags, even if it were to
use fixup branches internally.  Moreover, many of these are silent
failures; there is no way that git-cvsimport could even determine that
the cvsps output is inadequate.  For example, if I understand correctly:

- cvsps pretends that a tag or branch is applied to a single snapshot of
the repository on a single branch, even though in reality:

  - some files might have been left out of the tag/branch (cvsps doesn't
give any indication if this was the case).  If this tag/branch is
checked out, the files that were not tagged are erroneously included.

  - the revisions being tagged might not all have existed
contemporaneously (cvsps indicates these cases by marking the tags
**FUNKY** or **INVALID**).

  - a tag can be applied to different files on different branches; e.g.,
a tag can contain file1:1.3 (from trunk) and file2:1.2.2.1 (from some
other branch).  cvsps seems to pick one branch as source without
indicating a problem.  The inevitable result in cvsps is that the tag
includes the wrong contents for some files with no way to detect the error.

- If there is no commit on a branch, cvsps ignores the branch entirely.
 (Maybe this is fixed by your recent patch?)

- If there are multiple tags applied to the same set of file revisions
(for example, a daily tag and a release tag), cvsps silently ignores all
but one of them.  This causes unavoidable data loss in git-cvsimport.
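
The mixed-branch case at least can be spotted from the revision numbers
alone.  A minimal Python sketch, with invented function names: a CVS
trunk revision has two components (1.3), while a revision on a branch
has four or more (1.2.2.1).  Branch numbers are assigned per file, so
the sketch only separates trunk from non-trunk; deciding whether two
branch revisions in different files belong to the same named branch
needs the symbol table of each file.

```python
def is_trunk_revision(rev):
    """True for two-component revisions such as '1.3'."""
    return rev.count(".") == 1


def mixes_trunk_and_branch(tag_contents):
    """True if a tag names both trunk and branch revisions.

    `tag_contents` maps file name -> tagged revision.
    """
    kinds = {is_trunk_revision(rev) for rev in tag_contents.values()}
    return len(kinds) == 2


print(mixes_trunk_and_branch({"file1": "1.3", "file2": "1.2.2.1"}))  # prints True
```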

There are lots of more complicated scenarios that I haven't tested
against cvsps...

Granted, cvsps was not written to be usable for converters.  But
regardless of whether the output is being read by a human or by another
program, its output can be wrong, and there is often no way to tell from
the output that it had a problem.  Maybe cvsps could emit warning
annotations in more of the situations that it punts on, and
git-cvsimport could pass these warnings along to the end user?
Otherwise people will believe that git-cvsimport is converting their
repository accurately when in fact it often silently produces incorrect
output.

> The place where the fixup branch logic needs to be is in git-cvsimport,
> not in cvsps.  Better yet, get rid of git-cvsimport and replace it with
> cvs2git if it works better.  

cvs2git hopefully gives a more accurate conversion of a CVS repository
-- it handles all of the cases described above, plus many more [1] --
but it is much slower and can't work incrementally.  So there is
definitely still demand for something like git-cvsimport.

Michael

[1] http://cvs2svn.tigris.org/features.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-09  1:53         ` Michael Haggerty
@ 2008-04-27  5:06           ` Ping Yin
  2008-04-27  5:47             ` Michael Haggerty
  0 siblings, 1 reply; 18+ messages in thread
From: Ping Yin @ 2008-04-27  5:06 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Mansfield, Steffen Prohaska, git

On Wed, Apr 9, 2008 at 9:53 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>  > The place where the fixup branch logic needs to be is in git-cvsimport,
>  > not in cvsps.  Better yet, get rid of git-cvsimport and replace it with
>  > cvs2git if it works better.
>
>  cvs2git hopefully gives a more accurate conversion of a CVS repository
>  -- it handles all of the cases described above, plus many more [1] --
>  but it is much slower and can't work incrementally.  So there is
>  definitely still demand for something like git-cvsimport.
>

These days I have been trying to convert a CVS repository to git. I really
want the conversion to be as accurate as possible. However, the CVS
repository has been tagged in a very bad style, which makes
git-cvsimport and cvsps not work well.

cvs2git sounds like the right tool to try. Unfortunately, I
can't touch the CVS repository directly. So is it possible to use
cvs2git on a remote host instead of the host of the CVS repository,
just as git-cvsimport does? Yes, I know it can't now. I just wonder
whether it is possible to implement.

I chose to reply to this thread instead of opening a new one because
I think this reply of Michael's describes many shortcomings of
git-cvsimport and cvsps but has got no replies yet.

-- 
Ping Yin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-27  5:06           ` Ping Yin
@ 2008-04-27  5:47             ` Michael Haggerty
  2008-04-27  5:51               ` Ping Yin
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Haggerty @ 2008-04-27  5:47 UTC (permalink / raw)
  To: Ping Yin; +Cc: David Mansfield, Steffen Prohaska, git

Ping Yin wrote:
> These days i tried to convert the cvs repository into git. I really
> want the conversion to be as accurate as possible. However, the cvs
> repository has been tagged in a very bad style which makes
> git-cvsimport or cvsps not work well.
> 
> cvs2git sounds to be the right tool i should try. Unfortualely, i
> can't touch the cvs repository directly. So is it possible to use
> cvs2git in the remote host instead of the host of the cvs repository
> just as git-cvsimport does? Yes, i know it can't now. I just wonder
> whether it is possible to implement.

cvs2svn/cvs2git itself can't work with remote repositories.  It would be
enough if you could just get a copy of the repository; obviously you
don't need to use the original.

If you can't get a copy of the CVS repository directly, you might be
able to recreate it indirectly via information read over the CVS
protocol using a tool like CVSsuck [1,2].  I have no experience with
CVSsuck, so if you try it out, please let us know whether you were
successful.

Presumably some CVSsuck-like functionality could be built directly into
cvs2git, but given that this request hasn't come up very often and that
the two tools can presumably be used in concert, it doesn't seem worth
the effort.

Michael

[1] http://cvs.m17n.org/~akr/cvssuck/
[2] http://cvs2svn.tigris.org/faq.html#repoaccess

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-27  5:47             ` Michael Haggerty
@ 2008-04-27  5:51               ` Ping Yin
  2008-04-27  7:38                 ` Ping Yin
  0 siblings, 1 reply; 18+ messages in thread
From: Ping Yin @ 2008-04-27  5:51 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Mansfield, Steffen Prohaska, git

On Sun, Apr 27, 2008 at 1:47 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Ping Yin wrote:
>  > These days i tried to convert the cvs repository into git. I really
>  > want the conversion to be as accurate as possible. However, the cvs
>  > repository has been tagged in a very bad style which makes
>  > git-cvsimport or cvsps not work well.
>  >
>  > cvs2git sounds to be the right tool i should try. Unfortualely, i
>  > can't touch the cvs repository directly. So is it possible to use
>  > cvs2git in the remote host instead of the host of the cvs repository
>  > just as git-cvsimport does? Yes, i know it can't now. I just wonder
>  > whether it is possible to implement.
>
>  cvs2svn/cvs2git itself can't work with remote repositories.  It would be
>  enough if you could just get a copy of the repository; obviously you
>  don't need to use the original.
>
>  If you can't get a copy of the CVS repository directly, you might be
>  able to recreate it indirectly via information read over the CVS
>  protocol using a tool like CVSsuck [1,2].  I have no experience with
>  CVSsuck, so if you try it out, please let us know whether you were
>  successful.
>

Thanks. If I try out cvssuck, I will let you know.


-- 
Ping Yin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-27  5:51               ` Ping Yin
@ 2008-04-27  7:38                 ` Ping Yin
  2008-04-27  7:43                   ` Ping Yin
  2008-04-27  7:48                   ` Ping Yin
  0 siblings, 2 replies; 18+ messages in thread
From: Ping Yin @ 2008-04-27  7:38 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Mansfield, Steffen Prohaska, git

On Sun, Apr 27, 2008 at 1:51 PM, Ping Yin <pkufranky@gmail.com> wrote:
> On Sun, Apr 27, 2008 at 1:47 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>  > Ping Yin wrote:
>  >  > These days I have been trying to convert a CVS repository to git. I really
>  >  > want the conversion to be as accurate as possible. However, the CVS
>  >  > repository has been tagged in a very bad style, which makes
>  >  > git-cvsimport and cvsps not work well.
>  >  >
>  >  > cvs2git sounds like the right tool to try. Unfortunately, I
>  >  > can't touch the CVS repository directly. So is it possible to run
>  >  > cvs2git on a remote host instead of the host of the CVS repository,
>  >  > just as git-cvsimport does? Yes, I know it can't now. I just wonder
>  >  > whether it is possible to implement.
>  >
>  >  cvs2svn/cvs2git itself can't work with remote repositories.  It would be
>  >  enough if you could just get a copy of the repository; obviously you
>  >  don't need to use the original.
>  >
>  >  If you can't get a copy of the CVS repository directly, you might be
>  >  able to recreate it indirectly via information read over the CVS
>  >  protocol using a tool like CVSsuck [1,2].  I have no experience with
>  >  CVSsuck, so if you try it out, please let us know whether you were
>  >  successful.
>  >
>
>  Thanks. If I try out CVSsuck, I will let you know.
>

Great, it worked, and the result is exactly what I want!

However, it is very slow.

Here is an example of converting a module named util from CVS to git:
--------------------------------------------------------------------------------------------
$ cvssuck $CVSROOT util                   <1>
$ mkdir util/CVSROOT                        <2>
$ edit cvs2svn-git.options and cvs2svn-example.options
   ( change run_options.add_project and ctx.cvs_log_decorder)
$ cvs2svn --options=cvs2svn-git.options
$ mkdir util.git && cd util.git && git init
$ cat ../cvs2svn-tmp/git-{blob,dump}.dat  | git-fast-import
-------------------------------------------------------------------------------------------
<1> Very slow: about 30 minutes for a very small module.
    The other steps are fast enough.
<2> I had to create the directory util/CVSROOT to avoid this error:
"util is not a CVS repository, nor a path within a CVS repository.  A
CVS repository contains a CVSROOT directory within its root
directory."
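
The workaround in <2> can be wrapped in a small helper so that it is safe
to re-run. This is only a sketch: prepare_mirror is a hypothetical name,
not part of cvssuck or cvs2svn, and it assumes the mirrored module
directory (util here) already exists.

```shell
# prepare_mirror DIR
# Hypothetical helper: make sure DIR contains a CVSROOT subdirectory,
# which cvs2svn checks for before treating DIR as a CVS repository.
prepare_mirror() {
    mirror_dir=$1
    if [ ! -d "$mirror_dir" ]; then
        echo "no such directory: $mirror_dir" >&2
        return 1
    fi
    # -p makes the call idempotent: no error if CVSROOT already exists.
    mkdir -p "$mirror_dir/CVSROOT"
}
```

It would be called as "prepare_mirror util" between the cvssuck and
cvs2svn steps.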


-- 
Ping Yin

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-27  7:38                 ` Ping Yin
@ 2008-04-27  7:43                   ` Ping Yin
  2008-04-27  7:48                   ` Ping Yin
  1 sibling, 0 replies; 18+ messages in thread
From: Ping Yin @ 2008-04-27  7:43 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Mansfield, Steffen Prohaska, git

On Sun, Apr 27, 2008 at 3:38 PM, Ping Yin <pkufranky@gmail.com> wrote:

>  <1> very slow, about 30 minutes for a very small module.

More precisely, the module has about 500 commits and 300 files.


-- 
Ping Yin

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-27  7:38                 ` Ping Yin
  2008-04-27  7:43                   ` Ping Yin
@ 2008-04-27  7:48                   ` Ping Yin
  2008-04-27  8:48                     ` Ping Yin
  1 sibling, 1 reply; 18+ messages in thread
From: Ping Yin @ 2008-04-27  7:48 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Mansfield, Steffen Prohaska, git

On Sun, Apr 27, 2008 at 3:38 PM, Ping Yin <pkufranky@gmail.com> wrote:
>
> On Sun, Apr 27, 2008 at 1:51 PM, Ping Yin <pkufranky@gmail.com> wrote:
>  > On Sun, Apr 27, 2008 at 1:47 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>  >  > Ping Yin wrote:
>  >  >  > These days I have been trying to convert a CVS repository to git. I really
>  >  >  > want the conversion to be as accurate as possible. However, the CVS
>  >  >  > repository has been tagged in a very bad style, which makes
>  >  >  > git-cvsimport and cvsps not work well.
>  >  >  >
>  >  >  > cvs2git sounds like the right tool to try. Unfortunately, I
>  >  >  > can't touch the CVS repository directly. So is it possible to run
>  >  >  > cvs2git on a remote host instead of the host of the CVS repository,
>  >  >  > just as git-cvsimport does? Yes, I know it can't now. I just wonder
>  >  >  > whether it is possible to implement.
>  >  >
>  >  >  cvs2svn/cvs2git itself can't work with remote repositories.  It would be
>  >  >  enough if you could just get a copy of the repository; obviously you
>  >  >  don't need to use the original.
>  >  >
>  >  >  If you can't get a copy of the CVS repository directly, you might be
>  >  >  able to recreate it indirectly via information read over the CVS
>  >  >  protocol using a tool like CVSsuck [1,2].  I have no experience with
>  >  >  CVSsuck, so if you try it out, please let us know whether you were
>  >  >  successful.
>  >  >
>  >
>  >  Thanks. If I try out CVSsuck, I will let you know.
>  >
>
>  Great, it worked, and the result is exactly what I want!
>

Not exactly, as it turns out in another conversion of mine.

$ git log --pretty=oneline --name-status x64_UI_071204
724eb47 \
This commit was manufactured by cvs2svn to create tag 'x64_UI_071204'.
9362987 add support of writing cookies of 'fromid';
M       logqueue.c
M       ui.c
M       ui.h

Should we avoid recording commit 724eb47, since it is identical to
commit 9362987 (no content change)?
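
One rough way to confirm that such a commit carries no content change is
to compare its tree object with that of its first parent. The helper
below is a sketch only; is_empty_commit is a hypothetical name, and it
relies on nothing beyond "git rev-parse".

```shell
# is_empty_commit REV
# Hypothetical helper: succeeds (exit 0) when REV points at exactly the
# same tree as its first parent, i.e. the commit changes no content.
# Returns 2 when REV or its parent cannot be resolved (e.g. a root commit).
is_empty_commit() {
    rev=$1
    tree=$(git rev-parse --verify --quiet "$rev^{tree}") || return 2
    parent_tree=$(git rev-parse --verify --quiet "$rev^1^{tree}") || return 2
    [ "$tree" = "$parent_tree" ]
}
```

For the tag above, this would be run inside the converted repository as
"is_empty_commit x64_UI_071204 && echo 'no content change'".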

-- 
Ping Yin

* Re: [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports
  2008-04-27  7:48                   ` Ping Yin
@ 2008-04-27  8:48                     ` Ping Yin
  0 siblings, 0 replies; 18+ messages in thread
From: Ping Yin @ 2008-04-27  8:48 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Mansfield, Steffen Prohaska, git

On Sun, Apr 27, 2008 at 3:48 PM, Ping Yin <pkufranky@gmail.com> wrote:
>  >  >
>  >  >  Thanks. If I try out CVSsuck, I will let you know.
>  >  >
>  >
>  >  Great, it worked, and the result is exactly what I want!
>  >
>
>  Not exactly, as it turns out in another conversion of mine.
>
>  $ git log --pretty=oneline --name-status x64_UI_071204
>  724eb47 \
>  This commit was manufactured by cvs2svn to create tag 'x64_UI_071204'.
>  9362987 add support of writing cookies of 'fromid';
>  M       logqueue.c
>  M       ui.c
>  M       ui.h
>
>  Should we avoid recording commit 724eb47, since it is identical to
>  commit 9362987 (no content change)?
>

I have found the related issue at
http://cvs2svn.tigris.org/issues/show_bug.cgi?id=117 and the solution,
contrib/git-move-tags.pl (in cvs2svn trunk, not released yet). Sorry
for the noise.


-- 
Ping Yin

Thread overview: 18+ messages
2008-04-02  1:34 [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports David Mansfield
2008-04-02 19:29 ` Junio C Hamano
2008-04-03  1:44   ` David Mansfield
2008-04-03  2:06     ` Junio C Hamano
2008-04-03  2:27       ` David Mansfield
2008-04-03  5:47 ` Steffen Prohaska
2008-04-03 13:49   ` David Mansfield
2008-04-04  9:52     ` Michael Haggerty
2008-04-07 17:54       ` David Mansfield
2008-04-07 18:07         ` Jean-François Veillette
2008-04-09  1:53         ` Michael Haggerty
2008-04-27  5:06           ` Ping Yin
2008-04-27  5:47             ` Michael Haggerty
2008-04-27  5:51               ` Ping Yin
2008-04-27  7:38                 ` Ping Yin
2008-04-27  7:43                   ` Ping Yin
2008-04-27  7:48                   ` Ping Yin
2008-04-27  8:48                     ` Ping Yin