From: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>,
Jonathan Nieder <jrnieder@gmail.com>,
git@vger.kernel.org, David Michael Barr <davidbarr@google.com>,
Andrew Sayers <andrew-git@pileofstuff.org>,
Dmitry Ivankov <divanorama@gmail.com>,
Sam Vilain <sam@vilain.net>
Subject: Re: GSOC remote-svn: branch detection
Date: Tue, 07 Aug 2012 23:26:26 +0200
Message-ID: <3476983.FSv5Fk2g49@flobuntu>
In-Reply-To: <CALkWK0mu1=NEUZzB1VPAf0DU_nguuq_nJ-9Rn7Pj6zeNfoZGtA@mail.gmail.com>
On Saturday 04 August 2012 23:53:58 Ramkumar Ramachandra wrote:
> Hi,
>
> Florian Achleitner wrote:
> > 1. Import linearly and split later:
> I think this approach will be a lot less messy if you can cleanly
> separate the fetching component from the mapper. Currently, svndump
> re-creates the layout of the SVN repository. And the series you
> posted last week contains a patch that attaches a note with SVN
> metadata to each commit. Do you have thoughts on how the mapping will
> take place?
The mapping itself is currently a black box for me; its internals could be
rather complex. It could get a function like is_branch_start that is called
with a node ctx and tells whether that node is likely to be the start of a
branch. The detected branches are stored, and subsequent changes in the
associated directories are mapped to commits on the corresponding branch.
The detection of branch starts and the list of existing branches can come
from whatever logic we want. So that's approximately the idea.
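A rough sketch of that interface, in Python for brevity rather than the C of
svndump.c. NodeCtx is a simplified, hypothetical stand-in for struct node_ctx,
and both BranchMap and the "copy directly below branches/" heuristic are
assumptions for illustration, not existing code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeCtx:
    """Hypothetical, simplified stand-in for struct node_ctx."""
    action: str                    # "add", "change", "delete" or "replace"
    kind: str                      # "dir" or "file"
    dst: str                       # path the node change applies to
    src: Optional[str] = None      # copyfrom path, if the node is a copy
    src_rev: Optional[int] = None  # copyfrom revision, if the node is a copy


# Assumed conventional layout; real detection logic could be anything.
BRANCH_PARENT = "branches"


def is_branch_start(node: NodeCtx) -> bool:
    """Guess whether this node change starts a branch.

    Heuristic (an assumption): a directory added as a copy directly
    below branches/.
    """
    if node.action != "add" or node.kind != "dir" or node.src is None:
        return False
    parts = node.dst.strip("/").split("/")
    return len(parts) == 2 and parts[0] == BRANCH_PARENT


class BranchMap:
    """Remembers detected branch directories and maps paths to refs."""

    def __init__(self):
        # directory -> ref; mapping trunk to master is an assumption
        self.roots = {"trunk": "refs/heads/master"}

    def observe(self, node: NodeCtx) -> None:
        """Record the node's directory if it looks like a branch start."""
        if is_branch_start(node):
            root = node.dst.strip("/")
            self.roots[root] = "refs/heads/" + root.split("/", 1)[1]

    def ref_for(self, path: str) -> Optional[str]:
        """Return the ref a changed path belongs to, or None if unmapped."""
        path = path.strip("/")
        for root, ref in self.roots.items():
            if path == root or path.startswith(root + "/"):
                return ref
        return None
```

The point is only that the detection logic stays behind is_branch_start, so it
can be swapped without touching the code that assigns changes to branches.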
Currently I'm working on more basic preparations. I want to split the creation
of commits from the creation of blobs in svndump.c.
This is necessary because fast-import requires a branch name as an argument to
the 'commit' command, and currently a 'commit' command is started as soon as a
new revision is encountered in the svndump.
But to decide which branch the commit should go on, or even whether it will be
more than one commit, it is necessary to read all the nodes first.
To avoid buffering the node content, I want to replace the inline data format
(currently used) with 'blob' commands.
While parsing the dump, every node change creates a blob command that feeds the
data immediately into fast-import, while the node metadata (struct node_ctx) is
stored at least until the revision ends. Then the blobs can be put on a linear
master tree and on other branch trees. The node metadata could also be read
from notes when remapping branches.
That's not so easy to do, because the current implementation mixes tree
operations and blob operations heavily, and relies on a single global
node_ctx.
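To illustrate the intended stream order, here is a minimal sketch, again in
Python rather than C. RevisionWriter and simple_ref_for are hypothetical names,
and the sketch makes several assumptions: it only handles file additions and
changes, uses a fixed file mode, and assumes paths that need no quoting. Blobs
are written with marks as soon as a node is parsed; the 'commit' commands, which
need the branch name, are only written once the revision ends:

```python
import io


class RevisionWriter:
    """Emits blobs immediately and defers commits to the end of a revision."""

    def __init__(self, out):
        self.out = out        # binary stream that would feed git fast-import
        self.next_mark = 1
        self.pending = []     # (path, mark) pairs of the current revision

    def node(self, path: str, content: bytes) -> None:
        """Write the blob right away; keep only path and mark in memory."""
        mark = self.next_mark
        self.next_mark += 1
        self.out.write(b"blob\nmark :%d\ndata %d\n" % (mark, len(content)))
        self.out.write(content)
        self.out.write(b"\n")
        self.pending.append((path, mark))

    def end_revision(self, ref_for, committer: bytes, when: int, log: bytes) -> None:
        """Group the stored node changes by branch, one commit per branch."""
        by_ref = {}
        for path, mark in self.pending:
            ref, rel = ref_for(path)
            by_ref.setdefault(ref, []).append((rel, mark))
        for ref, changes in by_ref.items():
            self.out.write(b"commit %s\n" % ref.encode())
            self.out.write(b"committer %s %d +0000\n" % (committer, when))
            self.out.write(b"data %d\n" % len(log))
            self.out.write(log)
            self.out.write(b"\n")
            for rel, mark in changes:
                self.out.write(b"M 100644 :%d %s\n" % (mark, rel.encode()))
            self.out.write(b"\n")
        self.pending.clear()


def simple_ref_for(path: str):
    """Hypothetical mapper: trunk/ -> master, branches/<name>/ -> <name>."""
    parts = path.split("/")
    if parts[0] == "trunk":
        return "refs/heads/master", "/".join(parts[1:])
    if parts[0] == "branches" and len(parts) > 2:
        return "refs/heads/" + parts[1], "/".join(parts[2:])
    # Assumption: anything unmapped stays on master under its full path.
    return "refs/heads/master", path
```

For example, feeding two nodes of one revision and then ending it:

```python
buf = io.BytesIO()
w = RevisionWriter(buf)
w.node("trunk/a.txt", b"hello")
w.node("branches/topic/b.txt", b"hi")
w.end_revision(simple_ref_for, b"A U Thor <author@example.com>", 1344374786, b"r1")
```

Only the small (path, mark) pairs are held until the revision ends, not the
node content itself.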
>
> Ram
Flo