From: "Philippe Bruhat (BooK)" <philippe.bruhat@free.fr>
To: git@vger.kernel.org
Subject: Re: git-fast-export bug, commits emmitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
Date: Thu, 12 Jun 2008 16:02:10 +0200
Message-ID: <20080612140210.GE3830@plop>
In-Reply-To: <g2r66q$d3j$1@ger.gmane.org>
On Thu, Jun 12, 2008 at 02:52:40PM +0200, Michael J Gruber wrote:
> Yves Orton venit, vidit, dixit 12.06.2008 14:16:
>> We want a more or less linear repo as the result. This bug with
>> fast-export was the main showstopper in our efforts. However, I can
>> imagine that this is a problem that many people will want to solve. It
>> would be nice if there was an easier way to do it than what we currently
>> are doing (merging and munging multiple fast-export streams into a
>> single fast-import process). While at this point it's probably academic,
>> any suggestions as to the Best Way to do this would be very much
>> welcome.
>
> I've done something like this, "stitching" the history of different
> repos together in order to produce one repo, with each of the
> constituents in a subdir. What I did was an adaptation of
>
> http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html
>
> but as a multistep version:
What Yves and I did was write a script that does the following:
- run git fast-export --all (and now --topo-order) on all the repositories
  we wanted to merge, and read blocks from each stream
- pass through all non-commit blocks (munging paths to put the content of
  each repo in its own directory, and renumbering marks to avoid clashes)
- keep a list of the next commit sent by fast-export for each repo
- select the oldest of those commits and send it through, after stitching
  it in at the right place (the point being to determine the "right place")
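
The "select the oldest commit" step in the list above amounts to a k-way merge by commit date. Here is a minimal Python sketch of that idea only. It is not the script described above: it assumes each stream has already been parsed into (timestamp, commit) pairs, and the name merge_oldest_first and the sample data are hypothetical.

```python
import heapq

def merge_oldest_first(streams):
    """Yield (repo, timestamp, commit), always picking the oldest pending commit.

    `streams` maps a repo name to an iterable of (timestamp, commit) pairs,
    in the order that repo's exporter produced them. Only the head of each
    stream is ever considered, so each repo's own order is preserved.
    """
    iters = {repo: iter(commits) for repo, commits in streams.items()}
    heap = []
    # Prime the heap with the next pending commit of every repository.
    for repo, it in iters.items():
        first = next(it, None)
        if first is not None:
            heapq.heappush(heap, (first[0], repo, first[1]))
    while heap:
        ts, repo, commit = heapq.heappop(heap)
        yield repo, ts, commit
        # Refill from the stream we just consumed from.
        nxt = next(iters[repo], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], repo, nxt[1]))

# Hypothetical sample data: two repos with integer timestamps.
streams = {
    "A": [(10, "A1"), (40, "A2")],
    "B": [(20, "B1"), (30, "B2")],
}
order = [commit for _, _, commit in merge_oldest_first(streams)]
```

Because a commit only becomes a candidate once everything before it in its own stream has been sent, parents are always emitted before their children as long as each stream is itself in topological order.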
Actually, what we are trying to do is produce a single DAG from 2 or
more DAGs, while making sure that each "internal DAG" is the same.
(I'm pretty sure this is all trivial stuff for graph mathematicians)
Imagine we merged repositories A, B and C into a new repo D: if we replace
all the nodes of D coming from B and C with plain edges, we end up with
the original A graph.
We defined the "right place" as follows: once we have selected the next
commit to add to our new graph, each of its new parents is "the last
alien child of the original parent" (or the original parent itself, if it
has no alien child), an alien node being one that comes from another repo.
For example, if our new repository being built looks like:
    --A7--A8--B4--B5--A9--B7
                \
                 --B6
In this case, A9 was originally attached to A8, but to avoid unnecessary
branching in the new repo, we didn't attach it to A8, but to B5 (last
alien child of A8, descending the tree in a leftmost manner).
No A node will ever be attached to B6. The next A node originally
attached to A8 will be attached to B5 again, and one originally attached
to A9 will be attached to B7. Like this:
                     --A10
                    /
    --A7--A8--B4--B5--A9--B7--A11
                \
                 --B6
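
The attachment rule can be sketched in a few lines of Python. This is one reading of "last alien child, descending in a leftmost manner", not the actual script: starting from the original parent, keep moving to the first child that comes from another repo until there is none. The function names, the children/origin dictionaries, and the assumption that B6 hangs off B4 are all mine.

```python
def last_alien_child(children, origin, parent, repo):
    """Descend from `parent` through nodes alien to `repo`, leftmost first.

    Returns the node the new commit should be attached to: the last alien
    node reached, or `parent` itself if it has no alien child.
    """
    node = parent
    while True:
        alien = [c for c in children.get(node, []) if origin[c] != repo]
        if not alien:
            return node
        node = alien[0]  # leftmost alien child

def attach(children, origin, commit, repo, original_parent):
    """Attach `commit` to the new graph and return its new parent."""
    new_parent = last_alien_child(children, origin, original_parent, repo)
    children.setdefault(new_parent, []).append(commit)
    origin[commit] = repo
    return new_parent

# The graph before A10 and A11 arrive (B6 assumed to be a child of B4).
children = {"A7": ["A8"], "A8": ["B4"], "B4": ["B5", "B6"],
            "B5": ["A9"], "A9": ["B7"]}
origin = {n: n[0] for n in ["A7", "A8", "A9", "B4", "B5", "B6", "B7"]}

p10 = attach(children, origin, "A10", "A", "A8")  # original parent is A8
p11 = attach(children, origin, "A11", "A", "A9")  # original parent is A9
```

Under these assumptions p10 is B5 and p11 is B7, matching the diagram, and B6 never receives an A node because the descent always prefers B5.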
Now, if we remove all B nodes, we get this:
             --A10
            /
    --A7--A8--A9--A11
which is the original A graph.
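
That check is easy to automate. The following Python sketch (hypothetical names; the parent maps are transcribed by hand from the example, with B6 assumed to be a child of B4) removes every alien node by walking up to the nearest ancestors from the repo being kept, then compares the result with the original graph.

```python
def contract(parents, origin, keep):
    """Return the parent map restricted to repo `keep`.

    Each kept node is linked to its nearest kept ancestors, skipping over
    any alien nodes in between.
    """
    def nearest_kept(node):
        found = []
        for p in parents.get(node, []):
            if origin[p] == keep:
                found.append(p)
            else:
                found.extend(nearest_kept(p))  # skip the alien node
        return found

    return {n: nearest_kept(n) for n in parents if origin[n] == keep}

# Merged graph from the example, as commit -> list of new parents.
merged = {
    "A7": [], "A8": ["A7"], "B4": ["A8"], "B5": ["B4"], "B6": ["B4"],
    "A9": ["B5"], "A10": ["B5"], "B7": ["A9"], "A11": ["B7"],
}
origin = {n: n[0] for n in merged}

# Original A graph, as commit -> list of original parents.
original_a = {"A7": [], "A8": ["A7"], "A9": ["A8"],
              "A10": ["A8"], "A11": ["A9"]}

recovered = contract(merged, origin, "A")
```

Comparing recovered with original_a is then a plain dictionary equality test. With merge commits the same ancestor could be reached along two paths, so a real check would probably want to deduplicate the parent lists first.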
Finding the "last alien child" works fine with merges, too.
Of course, some commits from A might end up on an unrelated branch of B,
but all B branches are irrelevant to A anyway! :-)
--
Philippe Bruhat (BooK)
People are all unique- but some are more unique than others.
(Moral from Groo The Wanderer #22 (Epic))