From: Andrew Sayers <andrew-git@pileofstuff.org>
To: Sam Vilain <sam@vilain.net>
Cc: Stephen Bash <bash@genarts.com>, Nathan Gray <n8gray@n8gray.org>,
Jonathan Nieder <jrnieder@gmail.com>, Jeff King <peff@peff.net>,
git@vger.kernel.org, Sverre Rabbelier <srabbelier@gmail.com>,
Dmitry Ivankov <divanorama@gmail.com>,
Ramkumar Ramachandra <artagnon@gmail.com>,
David Barr <davidbarr@google.com>
Subject: Re: [spf:guess] Re: Approaches to SVN to Git conversion (was: Re: [RFC] "Remote helper for Subversion" project)
Date: Wed, 07 Mar 2012 22:06:40 +0000
Message-ID: <4F57DBF0.4060101@pileofstuff.org>
In-Reply-To: <4F56A4DF.8060807@vilain.net>
It sounds like we've approached two similar problems in similar ways, so
I'm curious about the differences where they exist. I've been reading
this message of yours from 18 months ago alongside this thread:
http://article.gmane.org/gmane.comp.version-control.git/150007
Unfortunately these comprise everything I know about Perforce.
I notice that git-p4raw stores all of its data in Postgres and provides
a programmatic interface for querying it, whereas I've focussed on
providing ASCII interfaces at relevant points. I can see how a DB store
would help manage the amount of data you'd need to process in a big
repository, but were there any other issues that drove you down this
route? Did you consider a text-based interface?
On 06/03/12 23:59, Sam Vilain wrote:
<snip>
> What I did for the Perl Perforce conversion is make this a multi-step
> process; first, the heuristic goes through and detects branches and
> merge parents. Then you do the actual export. If, however, the
> heuristic gets it wrong, then you can manually override the branch
> detection for a particular revision, which invalidates all of the
> _automatic_ decisions made for later revisions the next time you run it.
Could you give an example of overriding branch/merge detection? It
sounds like you're saying that if there's some problem detecting merge
parents in an early revision, then all future merges are ignored by the
script.
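To make the question concrete, here is a minimal Python sketch of the other possible reading: an override at one revision discards only the later automatic decisions, which the heuristic then re-derives on the next run. Everything in it (the `decisions` mapping, `apply_override`, `detect`, the "auto"/"manual" labels) is hypothetical and not taken from git-p4raw; it only illustrates the behaviour I'm asking about.

```python
def apply_override(decisions, rev, parents):
    """Record a manual decision for `rev` and drop later automatic ones.

    `decisions` maps revision number -> (parents, source), where source is
    "auto" or "manual". The layout is an assumption made for this sketch.
    """
    kept = {
        r: d for r, d in decisions.items()
        if r <= rev or d[1] == "manual"  # later manual overrides survive
    }
    kept[rev] = (parents, "manual")
    return kept


def detect(revisions, decisions, heuristic):
    """Run the heuristic only for revisions that have no decision yet."""
    for rev in revisions:
        if rev not in decisions:
            decisions[rev] = (heuristic(rev, decisions), "auto")
    return decisions


# First run: the (placeholder) heuristic decides everything automatically.
decisions = detect([1, 2, 3, 4], {}, lambda rev, seen: ["trunk"])

# The user corrects r2; the automatic decisions for r3 and r4 are dropped.
decisions = apply_override(decisions, 2, ["branches/x"])

# Next run: r3 and r4 are detected again, now building on the corrected r2.
decisions = detect([1, 2, 3, 4], decisions, lambda rev, seen: ["recomputed"])
```

Under this reading later merges are not ignored, just recomputed. If the tool behaves differently, that is exactly the difference I'd like to understand.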
<snip>
> The manual input is extremely useful for bespoke conversions; there will
> always be warts in the history and no heuristic is perfect (even if you
> can supply your own set of expressions, a way to override it for just
> one revision is handy).
Again, would you mind providing a few examples? It sounds like you have
some edge cases that could be handled by extending the branch history
format, but I'd like to pin it down a bit more before discussing solutions.
<snip>
> 3. skip bad sections of history, for instance squash merging merges
> which happened over several commits (SVN and Perforce, of course,
> support insane piecemeal merging prohibited by git)
This is an excellent point I've stumbled past in my experiments without
realising what I was seeing. A simple SVN example might look like this:
    svn add trunk branches
    svn add trunk/foo trunk/bar
    svn ci -m "Initial revision" # r1
    svn cp trunk branches/my_branch
    svn ci -m "Created my_branch" # r2
    # edit files in my_branch and commit them (r3..r10)
    svn merge branches/my_branch/foo trunk/foo
    svn ci -m "Merge my_branch -> trunk (1/3)" # r11
    svn merge branches/my_branch/bar trunk/bar
    svn ci -m "Merge my_branch -> trunk (2/3)" # r12
    svn cp branches/my_branch/new_file trunk/new_file
    svn ci -m "Merge my_branch -> trunk (3/3)" # r13
This strikes me as a sensibly cautious workflow in SVN, where merge
conflicts are common and changes are hard to revert. The best
representation for this in the current branch history format would be
something like this:
    In r1, create branch "trunk"
    In r2, create branch "branches/my_branch" from "trunk"
    In r13, merge "branches/my_branch" r13 into "trunk"
In other words, pretend r11 and r12 are just normal commits, and that
r13 is a full merge. A more useful (and arguably more accurate)
representation would be possible if we extended the format a bit:
    In r1, create branch "trunk"
    In r2, create branch "branches/my_branch" from "trunk"
    In r12, squash changes in "branches/my_branch"
    In r13, squash changes in "branches/my_branch"
    In r13, merge "branches/my_branch" r13 into "trunk"
Adding "squash" and "fixup" commands would let us represent the whole
messy business as a single commit, which is closer to what the user was
trying to say even if it's further from what they actually had to say.
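As a rough illustration of how an importer might consume the extended format, here is a minimal Python sketch. The grammar is inferred only from the example lines above, and the names (`PATTERNS`, `parse`, `squash_groups`) are hypothetical. It also assumes that a squash at rN folds rN into rN-1, in the style of "squash" in `git rebase -i`; a real tool would have to find the previous commit on the target branch instead.

```python
import re

# Hypothetical grammar, inferred from the example lines in this message.
PATTERNS = [
    ("create", re.compile(
        r'In r(?P<rev>\d+), create branch "(?P<branch>[^"]+)"'
        r'(?: from "(?P<source>[^"]+)")?$')),
    ("squash", re.compile(
        r'In r(?P<rev>\d+), squash changes in "(?P<source>[^"]+)"$')),
    ("merge", re.compile(
        r'In r(?P<rev>\d+), merge "(?P<source>[^"]+)" '
        r'r(?P<source_rev>\d+) into "(?P<branch>[^"]+)"$')),
]


def parse(text):
    """Turn branch history lines into a list of event dicts."""
    events = []
    for line in text.strip().splitlines():
        line = line.strip()
        for action, pattern in PATTERNS:
            match = pattern.match(line)
            if match:
                event = {"action": action, **match.groupdict()}
                event["rev"] = int(event["rev"])
                events.append(event)
                break
        else:
            raise ValueError(f"unrecognised line: {line!r}")
    return events


def squash_groups(events):
    """Group revisions that collapse into one commit.

    Assumption: a squash at rN folds rN into rN-1.
    """
    groups = []
    for rev in sorted(e["rev"] for e in events if e["action"] == "squash"):
        if groups and groups[-1][-1] == rev - 1:
            groups[-1].append(rev)      # extend the current chain
        else:
            groups.append([rev - 1, rev])  # start a new chain
    return groups


HISTORY = '''
In r1, create branch "trunk"
In r2, create branch "branches/my_branch" from "trunk"
In r12, squash changes in "branches/my_branch"
In r13, squash changes in "branches/my_branch"
In r13, merge "branches/my_branch" r13 into "trunk"
'''

events = parse(HISTORY)
print(squash_groups(events))  # [[11, 12, 13]]
```

With that grouping, r11 to r13 come out as one commit, and the merge event at r13 supplies its second parent.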
- Andrew