git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: David Mansfield <david@cobite.com>
Cc: Steffen Prohaska <prohaska@zib.de>, git@vger.kernel.org
Subject: Re: [PATCH] cvsps/cvsimport: fix branch point calculation	and	broken branch imports
Date: Wed, 09 Apr 2008 03:53:20 +0200	[thread overview]
Message-ID: <47FC2190.3070303@alum.mit.edu> (raw)
In-Reply-To: <1207590845.17329.98.camel@gandalf.cobite.com>

David Mansfield wrote:
> The design goal of cvsps was always simply to show who did what and in
> what chronological order.  However, just with that, it's impossible to
> use for the purpose it is currently being used for.

Good point.

I re-read the cvsps manpage and found the information about "FUNKY" and
"INVALID" tags.  I'd forgotten that cvsps does the right thing in some
cases by warning the user about tags that are beyond its abilities to
describe.  (But there are other problems that cvsps doesn't warn about;
see below.)

Then I looked in the git-cvsimport code to see how it deals with FUNKY
and INVALID tags.  It does the *wrong* thing by explicitly ignoring
these warnings (!).  IMHO git-cvsimport should notice the **FUNKY** and
**INVALID** annotations and at least output a warning to the user that
the associated tags may not have been converted correctly.

But cvsps makes some other symbol-related mistakes, presumably in the
name of simplification.  These problems make it impossible for
git-cvsimport to generate accurate branches and tags, even if it were to
use fixup branches internally.  Moreover, many of these are silent
failures; there is no way that git-cvsimport could even determine that
the cvsps output is inadequate.  For example, if I understand correctly:

- cvsps pretends that a tag or branch is applied to a single snapshot of
the repository on a single branch, even though in reality:

  - some files might have been left out of the tag/branch (cvsps doesn't
give any indication if this was the case).  If this tag/branch is
checked out, the files that were not tagged are erroneously included.

  - the revisions not being tagged might not have all existed
contemporaneously (cvsps indicates these cases by marking the tags
**FUNKY** or **INVALID**).

  - a tag can be applied to different files on different branches; e.g.,
a tag can contain file1:1.3 (from trunk) and file2:1.2.2.1 (from some
other branch).  cvsps seems to pick one branch as source without
indicating a problem.  The inevitable result in cvsps is that the tag
includes the wrong contents for some files with no way to detect the error.

- If there is no commit on a branch, cvsps ignores the branch entirely.
 (Maybe this is fixed by your recent patch?)

- If there are multiple tags applied to the same set of file revisions
(for example, a daily tag and a release tag), cvsps silently ignores all
but one of them.  This causes unavoidable data loss in git-cvsimport.

There are lots of more complicated scenarios that I haven't tested
against cvsps...

Granted, cvsps was not written to be usable for converters.  But
regardless of whether the output is being read by a human or by another
program, its output can be wrong, and there is often no way to tell from
the output that it had a problem.  Maybe cvsps could emit warning
annotations in more of the situations that it punts on, and
git-cvsimport could pass these warnings along to the end user?
Otherwise people will believe that git-cvsimport is converting their
repository accurately when in fact it often silently produces incorrect
output.

> The place where the fixup branch logic needs to be is in git-cvsimport,
> not in cvsps.  Better yet, get rid of git-cvsimport and replace it with
> cvs2git if it works better.  

cvs2git hopefully gives a more accurate conversion of a CVS repository
-- it handles all of the cases described above, plus many more [1] --
but it is much slower and can't work incrementally.  So there is
definitely still demand for something like git-cvsimport.

Michael

[1] http://cvs2svn.tigris.org/features.html

  parent reply	other threads:[~2008-04-09  1:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-02  1:34 [PATCH] cvsps/cvsimport: fix branch point calculation and broken branch imports David Mansfield
2008-04-02 19:29 ` Junio C Hamano
2008-04-03  1:44   ` David Mansfield
2008-04-03  2:06     ` Junio C Hamano
2008-04-03  2:27       ` David Mansfield
2008-04-03  5:47 ` Steffen Prohaska
2008-04-03 13:49   ` David Mansfield
2008-04-04  9:52     ` Michael Haggerty
2008-04-07 17:54       ` David Mansfield
2008-04-07 18:07         ` Jean-François Veillette
2008-04-09  1:53         ` Michael Haggerty [this message]
2008-04-27  5:06           ` Ping Yin
2008-04-27  5:47             ` Michael Haggerty
2008-04-27  5:51               ` Ping Yin
2008-04-27  7:38                 ` Ping Yin
2008-04-27  7:43                   ` Ping Yin
2008-04-27  7:48                   ` Ping Yin
2008-04-27  8:48                     ` Ping Yin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47FC2190.3070303@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=david@cobite.com \
    --cc=git@vger.kernel.org \
    --cc=prohaska@zib.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).