From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Marko Macek" <marko.macek@gmx.net>
Cc: git@vger.kernel.org
Subject: Re: Some tips for doing a CVS importer
Date: Sun, 26 Nov 2006 10:35:44 -0500
Message-ID: <9e4733910611260735g2b18e9d1p51a0dca153282cc7@mail.gmail.com>
In-Reply-To: <456969DA.6090702@gmx.net>

On 11/26/06, Marko Macek <marko.macek@gmx.net> wrote:
> Jon Smirl wrote:
>
> >
> > SVN hides the mini branch by creating a symbol like this:
> >
> > Symbol XXX, change set 70
> > copy All from change set 50
> > copy file A from change set 55
> > copy file B,C from change set 60
> > copy file D from change set 61
> > copy file E,F,G from change set 63
> > copy file H from change set 67
> >
> > It has to do all of those copies because the change sets weren't
> > constructed while taking symbol dependency information into account.
> >
> > Symbol XXX can't copy from change set 69 because commits from after
> > the symbol was created are included in change sets 51-69.
>
> Sometimes it is not actually possible to have a 'simple' symbol, even
> by following proper symbol dependencies.
>
> Some situations:
> - tags on some files are readjusted later, or tagged separately with an older
>  version
> - tag is created with a -D "date" and the file times are not in sync
> - tag is created from a mixed-revision working copy

I agree that there are a few cases where a simple symbol can't be made.
But the current cvs2svn makes no attempt at all to preserve simple
symbols. In my attempts at converting Mozilla, 60% of the symbols ended
up as tiny branches. I investigated a couple by hand and was able to
rearrange things to create simple symbols in every case I looked at.

This can be dealt with during the topological sort. If there are
complex symbol creations, you will end up with loops during the sort.
At that point you need to start breaking up change sets to remove the
loops, and you would use a heuristic to do it. Something like: try
breaking up to ten commit change sets to preserve a symbol; if you
can't preserve it with ten breaks, break the symbol once and try
again. Repeat until the loop is gone.
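
To make the heuristic concrete, here is a rough Python sketch. It is
not cvs2svn code. The names (topo_sort, node_edges, split,
sort_preserving_symbols), the "one node per item" split, and the tiny
example data are all made up for illustration. It also splits any
stuck node, where a real converter would first find the nodes that
are actually on the loop.

```python
from collections import defaultdict

def topo_sort(nodes, edges):
    """Kahn's algorithm. Returns (order, stuck); stuck holds the nodes
    that sit on a loop or behind one and so could not be ordered."""
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for a, b in edges:
        succ[a].append(b)
        indeg[b] += 1
    ready = sorted(n for n in nodes if indeg[n] == 0)
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)
        for m in sorted(succ[n]):
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return order, set(nodes) - set(order)

def node_edges(groups, item_deps):
    """Lift item-level dependencies to edges between change sets/symbols."""
    owner = {item: name for name, items in groups.items() for item in items}
    return {(owner[a], owner[b]) for a, b in item_deps if owner[a] != owner[b]}

def split(groups, name):
    """Crude break (an assumption): replace a node by one node per item."""
    items = sorted(groups.pop(name))
    parts = [f"{name}.{i}" for i in range(len(items))]
    for part, item in zip(parts, items):
        groups[part] = {item}
    return parts

def sort_preserving_symbols(groups, symbols, item_deps, max_breaks=10):
    groups = {name: set(items) for name, items in groups.items()}
    symbols = set(symbols)
    breaks = 0
    while True:
        order, stuck = topo_sort(groups, node_edges(groups, item_deps))
        if not stuck:
            return order, groups
        commits = sorted(n for n in stuck
                         if n not in symbols and len(groups[n]) > 1)
        syms = sorted(n for n in stuck
                      if n in symbols and len(groups[n]) > 1)
        if commits and (breaks < max_breaks or not syms):
            # Prefer breaking a commit change set to keep the symbol simple.
            split(groups, commits[0])
            breaks += 1
        elif syms:
            # Budget used up: break the symbol once and start counting again.
            symbols.remove(syms[0])
            symbols.update(split(groups, syms[0]))
            breaks = 0
        else:
            raise ValueError("loop exists between individual items")

# Change set C1 mixes a commit from before tag T (A:1.2) with one from
# after it (B:1.3), so C1 and T depend on each other.
groups = {"C0": {"B:1.2"}, "C1": {"A:1.2", "B:1.3"}, "T": {"T/A", "T/B"}}
deps = [("A:1.2", "T/A"), ("B:1.2", "T/B"),
        ("B:1.2", "B:1.3"), ("T/B", "B:1.3")]
order, result = sort_preserving_symbols(groups, {"T"}, deps)
print(order)
```

In this example the loop goes away after one break of C1 and the
symbol T stays in one piece. With max_breaks=0 the same input breaks
T instead, which is roughly what happens when commits are always
preserved.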

The current cvs2svn code effectively implements a heuristic where the
commits are always preserved at the expense of breaking the symbols.
Since some commit comments are very common (blank ones, for example),
those commits get combined into bigger change sets and trash the
simple symbols.

Another note for doing a converter: when combining things into change
sets for git import, matching comments should not be allowed to group
commits across branches and the trunk. Git doesn't allow simultaneous
commits to the trunk and branches.
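
As a sketch of what I mean, assuming change sets are detected by
matching author and comment inside a time window: put the branch into
the matching key too. The Rev tuple, the 300 second window, and the
rule that one file can't appear twice in a change set are assumptions
for the example, not something taken from an existing importer.

```python
from collections import namedtuple

# Hypothetical record for one file revision parsed out of an RCS file.
Rev = namedtuple("Rev", "path branch author comment time")

def group_change_sets(revs, window=300):
    """Group file revisions into change sets.

    Two revisions share a change set only if branch, author, and comment
    all match and they are at most `window` seconds apart. Keying on the
    branch keeps trunk and branch commits apart even when the comments
    are identical (blank, for example).
    """
    open_sets = {}   # (branch, author, comment) -> change set being filled
    result = []
    for rev in sorted(revs, key=lambda r: r.time):
        key = (rev.branch, rev.author, rev.comment)
        current = open_sets.get(key)
        if (current is None
                or rev.time - current[-1].time > window
                or any(r.path == rev.path for r in current)):
            # Start a new change set for this key.
            current = []
            open_sets[key] = current
            result.append(current)
        current.append(rev)
    return result

revs = [
    Rev("a.c", "trunk", "jon", "", 100),
    Rev("a.c", "BRANCH_1", "jon", "", 105),
    Rev("b.c", "trunk", "jon", "", 110),
    Rev("c.c", "trunk", "jon", "", 1000),
]
for change_set in group_change_sets(revs):
    print(change_set[0].branch, [r.path for r in change_set])
```

Here the blank-comment commit on BRANCH_1 lands in its own change set
even though it falls between two trunk commits with the same author
and comment.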

> While in the cases of 'time warp' the revision sequence should be
> considered more important than timestamps, this is not necessarily
> true for tags, since it's easily possible to create them on mixed
> revisions.
>
> cvs2svn also has a problem with vendor branches because it creates
> tags/branches that contain files from vendor branch by copying some
> files from the trunk and other files from the vendor branch.
> If the vendor branch/tag was only used for the initial import,
> it's IMO best to skip them in the conversion (this needs a patch).
> There are however problems because keyword expansion causes file
> differences.
>
> It seems that mozilla CVS repository has vendor branches/imports in
> some parts of the tree.

I never got around to checking out problems with vendor branches in Mozilla.


>
> Mark
>
>


-- 
Jon Smirl
