From: Ramkumar Ramachandra <artagnon@gmail.com>
To: Sam Vilain <sam@vilain.net>
Cc: Jonathan Nieder <jrnieder@gmail.com>,
Git Mailing List <git@vger.kernel.org>,
David Michael Barr <david.barr@cordelta.com>,
Sverre Rabbelier <srabbelier@gmail.com>,
"Shawn O. Pearce" <spearce@spearce.org>,
Daniel Shahaf <d.s@daniel.shahaf.name>,
Eric Wong <normalperson@yhbt.net>
Subject: Re: [GSoC update extra!] git-remote-svn: Week 8
Date: Wed, 30 Jun 2010 14:45:53 +0200
Message-ID: <20100630124553.GA30999@debian>
In-Reply-To: <1277862665.23613.8.camel@wilber>
Hi Sam,
Sam Vilain writes:
> On Thu, 2010-06-24 at 13:07 -0500, Jonathan Nieder wrote:
> > operation. In other words, it needs the tree for
> > http://path/to/some/svn/root/branches@r11. This does not correspond
> > to a single git tree, since the content of each branch has been given
> > its own commit.
>
> I wrote at length about this near the beginning of the project;
> essentially, figuring out whether particular paths are roots or not is
> not defined, as SVN does not distinguish between them (a misfeature
> cargo culted from Perforce). It becomes a data mining problem, you have
> this scattered data, and you have to find a history inside.
Right. Implementing git-svn on top of git-remote-svn might not be a
bad idea.
> As I recommended before, it probably makes more sense to keep a "remote
> tracking" branch which mirrors the *entire* repository, and sort out
> efficient ways to convert SVN revision paths like the above into tree
> IDs.
>
> I consider it very important to separate the data import and tracking
> stage from the data mining stage.
We're following this approach. At the moment, we're just focusing on
getting all the data directly from SVN into the Git store. Instead of
building trees for each SVN revision, we've found a way to do it
inside the Git object store. We're currently ironing out the details,
and I'll post an update about this shortly.
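To make the lookup you describe concrete, here is a minimal sketch of
resolving an SVN revision path such as branches@r11 against a mirror of
the entire repository. It is only an illustration under assumptions, not
the code we're writing: the Tree class, the resolve function, the
revision-to-root map, and the tree IDs are all hypothetical, and a real
version would read tree objects from the Git store rather than from
Python dicts.

```python
from typing import Dict, Optional, Union


class Tree:
    """Hypothetical stand-in for a git tree object.

    Entries map a name to either a blob ID (str) or a nested Tree.
    """

    def __init__(self, tree_id: str, entries: Dict[str, Union[str, "Tree"]]):
        self.tree_id = tree_id
        self.entries = entries


def resolve(rev_to_root: Dict[int, Tree], path: str, rev: int) -> Optional[str]:
    """Return the tree ID for `path` as of SVN revision `rev`.

    Assumes the mirror records one root tree per SVN revision.
    Returns None if the revision is unknown, the path does not exist,
    or the path names a file instead of a directory.
    """
    node = rev_to_root.get(rev)
    if node is None:
        return None
    # Walk the path one component at a time, ignoring empty components.
    for part in (p for p in path.split("/") if p):
        child = node.entries.get(part)
        if not isinstance(child, Tree):
            return None
        node = child
    return node.tree_id


# Made-up data: revision 11 of a repository with trunk and one branch.
trunk = Tree("t-trunk", {"README": "b1"})
feature = Tree("t-feature", {"README": "b2"})
branches = Tree("t-branches", {"feature": feature})
MIRROR = {11: Tree("t-root-r11", {"trunk": trunk, "branches": branches})}

print(resolve(MIRROR, "branches", 11))
```

With the whole repository mirrored, branches@r11 resolves to a single
subtree of the revision's root tree, even though each branch under it
has its own commit elsewhere.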
> Once the data mining stage is well solved, then it makes sense to look
> at ways that a tracking branch which only tracks a part of the
> Subversion repository can be achieved. In the simple case, where no
> repository re-organisation or cross-project renames have occurred it is
> relatively simple. But in general I think this is a harder problem,
> which cannot always be solved without intervention - and so not
> necessary to be solved in short-term milestones. As you are
> discovering, it is a can of worms which you avoid if you know you always
> have the complete SVN repository available.
Right. I'm not convinced that it necessarily requires user
intervention, though. Can you show, with an example, that there isn't
enough information available without user intervention? Or is it
possible, but simply too difficult (and not worth the effort) to mine
out the data?
-- Ram