From: Jonathan Nieder <jrnieder@gmail.com>
To: Andrew Sayers <andrew-git@pileofstuff.org>
Cc: Florian Achleitner <florian.achleitner@student.tugraz.at>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	David Barr <davidbarr@google.com>,
	Git Mailing List <git@vger.kernel.org>,
	Sverre Rabbelier <srabbelier@gmail.com>,
	Dmitry Ivankov <divanorama@gmail.com>
Subject: Re: GSOC Proposal draft: git-remote-svn
Date: Mon, 2 Apr 2012 19:09:45 -0500
Message-ID: <20120403000945.GA15075@burratino>
In-Reply-To: <4F7A3450.7000302@pileofstuff.org>

Andrew Sayers wrote:

> Sorry, that wasn't clear.  I meant commands that just expose a single
> primitive bit of functionality (like git-commit-tree) instead of those
> that present an abstract interface to the whole git machinery (like
> git-fast-import).

Ok.  I think you are misunderstanding the purpose of fast-import[1] but
it doesn't take away from what you're saying.

> I agree it's possible to use fast-import for this problem, but it seems
> like it's redundant after svn-fe has already loaded everything into git.

Right, I missed your point here before.  The fundamental question is
not about what commands to use but about the order of operations.

1. In one scheme, first you import the whole tree without splitting it
   into branches, with a tool like svn-fe.  Afterwards, you
   postprocess the resulting repository with tools like "git
   filter-branch --subdirectory-filter".  The result of the import can
   depend on all revisions --- you can say, in rev 1, "I'm not sure
   whether this new directory is a branch; let me see how it develops
   by rev 1000 to decide how to process it".

2. In another scheme, you only import the subset of the repository
   you are interested in.  This is what git-svn does, for example.
   This requires the branch discovery to happen at the same time as
   the import, because otherwise there is no way to tell what subset
   of the repository you are actually interested in.

3. Lastly, in yet another scheme, you import the whole tree and it is
   split into branches on the fly.  The advantages relative to (1) are:

   - impatient people can peek at the partial result of the import as
     it happens

   - the result of importing rev n is guaranteed to depend only on
     revs <= n, so different people importing at different times will
     get the same commits (assuming nobody is rewriting early history
     behind the scenes), and it is obvious how to support incremental
     imports that expand a repository with all revs <= n to a
     repository with all revs <= 2n

   However, if branches can only be split during the initial import,
   that makes it harder to tweak the configuration and try again to
   see what changes.
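
To make the "depends only on revs <= n" property in scheme (3)
concrete, here is a toy sketch in Python.  It is not how svn-fe or
git-svn work: the revision format, the trunk/ and branches/<name>/
layout rule, and the function names are all assumptions made up for
illustration.  The point is only that when each rev is classified
without looking ahead, importing a prefix of the history gives a
prefix of the full result.

```python
# Toy model: a revision is (revnum, list of changed svn paths).
# The layout rule below is an assumed convention, not something read
# from a real repository.

def branch_of(path):
    """Map an svn path to a branch name using a fixed layout rule."""
    parts = path.split("/")
    if parts[0] == "trunk":
        return "trunk"
    if parts[0] == "branches" and len(parts) > 1:
        return parts[1]
    return None  # path lies outside any branch this rule knows about


def split_on_the_fly(revisions):
    """Classify each rev using only that rev's own data (no lookahead)."""
    result = []
    for revnum, paths in revisions:
        branches = sorted({b for b in map(branch_of, paths) if b is not None})
        result.append((revnum, branches))
    return result


history = [
    (1, ["trunk/README"]),
    (2, ["branches/topic/README"]),
    (3, ["trunk/Makefile", "branches/topic/Makefile"]),
]

# Importing revs 1..2 yields exactly the first two entries of
# importing revs 1..3, so an incremental import can pick up at rev 3.
partial = split_on_the_fly(history[:2])
full = split_on_the_fly(history)
assert full[:len(partial)] == partial
```

A scheme (1) splitter has no such guarantee, since its decision for
rev 1 may depend on what rev 1000 looks like.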

The relevant technical difference is what information each naive
implementation can draw on.  In scheme (2) you can make use of
arbitrary information available over the svn protocol; in scheme (3)
you can only use information that makes it into the fast-import
stream; and in scheme (1) you can only use information that makes it
into the actual git repository.  So to use scheme (1) you need to make
sure svn-fe stores all interesting data in a visible way, including
copyfrom info (which is not a bad idea anyway).

[...]
> The point I was making in IRC was that (so far as I understand)
> fast-import doesn't let you pass trees around in this way, but instead
> requires you to transmit the contents of all the changed files.

fast-import's "ls" command allows exactly what you are talking about,
and svn-fe uses it to copy subtrees from earlier revs into later ones
when it receives an "svn cp" command.

See [2] for some work that predates it.
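
As a rough sketch of the exchange with "ls" (assuming the stream
syntax documented in git-fast-import(1): the "ls <dataref> <path>"
request, its "<mode> <type> <dataref> TAB <path>" reply, and the
"M <mode> <dataref> <path>" filemodify line), a frontend could copy a
subtree without resending any file contents roughly like this.  The
helper names are hypothetical, the sketch ignores path quoting, and
it is not taken from the svn-fe source.

```python
def ls_command(dataref, path):
    """Build an 'ls' request for path inside the tree named by dataref."""
    return "ls {} {}\n".format(dataref, path)


def parse_ls_response(line):
    """Parse one 'ls' reply line; return None for a 'missing' reply."""
    line = line.rstrip("\n")
    if line.startswith("missing "):
        return None
    # Reply layout: <mode> SP <type> SP <dataref> HT <path>
    meta, path = line.split("\t", 1)
    mode, objtype, dataref = meta.split(" ")
    return mode, objtype, dataref, path


def copy_command(response, dest):
    """Build the 'M' line that places the looked-up object at dest."""
    mode, _objtype, dataref, _path = response
    return "M {} {} {}\n".format(mode, dataref, dest)


# Ask for trunk as of the commit marked :1, then reuse the tree name
# from the reply to populate branches/topic in the commit being built.
request = ls_command(":1", "trunk")
reply = "040000 tree " + "a" * 40 + "\ttrunk\n"  # made-up object name
print(request + copy_command(parse_ls_response(reply), "branches/topic"), end="")
```

Only the tree's object name crosses the pipe; the blobs under it are
never retransmitted.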

Did I understand correctly?
Jonathan

[1] By acting as a single process that takes a stream of commands it
really is able to do something that no other plumbing command can do.
[2] http://thread.gmane.org/gmane.comp.version-control.git/158375
