* git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
@ 2008-06-12 10:21 Yves Orton
2008-06-12 11:53 ` Johannes Sixt
0 siblings, 1 reply; 7+ messages in thread
From: Yves Orton @ 2008-06-12 10:21 UTC (permalink / raw)
To: git
Hi,
I've been working with git-fast-export a bit recently and I've hit a bug
that is causing some trouble.
Essentially, it seems that for one of our repos git-fast-export fails to
emit the proper 'from' information for several commits. These commits
are emitted first, without parent data, even though their parents ARE
emitted later.
The code responsible for skipping the parent info is in
builtin-fast-export.c around line 402:
    for (i = 0, p = commit->parents; p; p = p->next) {
        int mark = get_object_mark(&p->item->object);
        if (!mark)
            continue;
        if (i == 0)
            printf("from :%d\n", mark);
        else
            printf("merge :%d\n", mark);
        i++;
    }
If I modify this loop to warn when skipping a parent, I get a warning for
each of the "broken" commits. Apparently, because they are emitted before
their parents, the parents have no "mark" assigned to them (via
decoration) and are thus skipped in this emit process. This would make
sense when exporting a limited number of commits, but makes no sense when
the --all option is used. I've tried to investigate further, but I got
lost in a twisty maze of routines in revision.c, which apparently is
responsible for building the list of items to emit in the correct order.
However, I think it is notable that both gitk and git log seem quite able
to deal with things properly, so I find it a bit strange that
fast-export would get it wrong.
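To make the failure mode concrete, here is a minimal Python model of the loop quoted above. It is only a sketch: `emit_stream` is a hypothetical name, the output lines are simplified, and the only behaviour it copies from the C code is that a parent without a mark is silently skipped.

```python
def emit_stream(commits):
    """Model of the parent loop: `commits` is a list of (name, parents)
    tuples in the order they are emitted. Returns simplified stream lines."""
    marks = {}   # commit name -> mark number, assigned when the commit is emitted
    lines = []
    for name, parents in commits:
        marks[name] = len(marks) + 1
        lines.append("mark :%d" % marks[name])
        i = 0
        for parent in parents:
            mark = marks.get(parent)
            if not mark:
                continue   # parent not emitted yet: its link is silently lost
            lines.append(("from :%d" if i == 0 else "merge :%d") % mark)
            i += 1
    return lines


# Parent first: the child keeps its 'from' line.
good = emit_stream([("parent", []), ("child", ["parent"])])
# Child first: the parent has no mark yet, so no 'from' line is written.
bad = emit_stream([("child", ["parent"]), ("parent", [])])
```

In the second call both commits are still emitted, but the stream no longer records that one is the parent of the other, which is how a linear history turns into "islands".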
Unfortunately I have no idea how to create a minimal repo that
illustrates this problem.
I'm currently on git version 1.5.6.rc2.29.g3ba9 (latest version from last
night); however, this problem shows itself on 1.5.4.3 as well, and on
an earlier version whose exact number I no longer know.
Other evidence that might be useful:
    git log --pretty=format:"%H:%P"
shows that every commit but one (the root) has parents. And gitk renders
the original repo fine. The repo can be cloned, etc., without trouble.
The problem seems to be strictly related to fast-export.
I'm not on the list, so please cc me on any replies.
Thanks a lot!
Yves
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
2008-06-12 10:21 git-fast-export bug, commits emmitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands" Yves Orton
@ 2008-06-12 11:53 ` Johannes Sixt
2008-06-12 12:04 ` Yves Orton
0 siblings, 1 reply; 7+ messages in thread
From: Johannes Sixt @ 2008-06-12 11:53 UTC (permalink / raw)
To: Yves Orton; +Cc: git
Yves Orton schrieb:
> Hi,
>
> Ive been working with git-fast-export a bit recently and Ive hit a bug
> that is causing some trouble.
>
> Essentially it seems that one of our repos git-fast-export fails to emit
> the proper 'from' information for several commits in the repo. These
> commits are emitted first without parent data even though their parents
> ARE emitted later.
Does it make a difference if you pass --topo-order to git fast-export?
(But I don't know for certain that this is even legal.)
-- Hannes
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
2008-06-12 11:53 ` Johannes Sixt
@ 2008-06-12 12:04 ` Yves Orton
2008-06-12 12:16 ` Yves Orton
0 siblings, 1 reply; 7+ messages in thread
From: Yves Orton @ 2008-06-12 12:04 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
On Thu, 2008-06-12 at 13:53 +0200, Johannes Sixt wrote:
> Yves Orton schrieb:
> > Hi,
> >
> > Ive been working with git-fast-export a bit recently and Ive hit a bug
> > that is causing some trouble.
> >
> > Essentially it seems that one of our repos git-fast-export fails to emit
> > the proper 'from' information for several commits in the repo. These
> > commits are emitted first without parent data even though their parents
> > ARE emitted later.
>
> Does it make a difference if you pass --topo-order to git fast-export?
> (But I don't know for certain that this is even legal.)
Yes, it does make a difference. A big difference. That would be the
workaround I really needed. At least currently that's the way it looks;
I haven't thoroughly tested the result yet, but it certainly looks right.
Perhaps this should be enabled by default to avoid the problem I
encountered? At least until the root cause of the problem is
identified and fixed.
Thanks a lot. ++ to you.
Cheers,
yves
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
2008-06-12 12:04 ` Yves Orton
@ 2008-06-12 12:16 ` Yves Orton
2008-06-12 12:45 ` Johannes Sixt
2008-06-12 12:52 ` Michael J Gruber
0 siblings, 2 replies; 7+ messages in thread
From: Yves Orton @ 2008-06-12 12:16 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
On Thu, 2008-06-12 at 14:04 +0200, Yves Orton wrote:
> On Thu, 2008-06-12 at 13:53 +0200, Johannes Sixt wrote:
> > Yves Orton schrieb:
> > > Hi,
> > >
> > > Ive been working with git-fast-export a bit recently and Ive hit a bug
> > > that is causing some trouble.
> > >
> > > Essentially it seems that one of our repos git-fast-export fails to emit
> > > the proper 'from' information for several commits in the repo. These
> > > commits are emitted first without parent data even though their parents
> > > ARE emitted later.
> >
> > Does it make a difference if you pass --topo-order to git fast-export?
> > (But I don't know for certain that this is even legal.)
>
> Yes it does make a difference. A big difference. That would be the
> workaround I really needed. At least currently thats the way it looks,
> i havent thoroughly tested the result yet but it certainly looks right.
>
> Perhaps this should be enabled by default to avoid the problem i
> encountered? At least until whatever the cause of the root problem is
> identified and fixed.
>
> Thanks a lot. ++ to you.
I should add that with this switch enabled the output order is correct,
although the mark number of the first commit is unchanged from the
original. The parent relationships are correctly restored and
the resulting repo has the correct SHA1 stamps. So it looks like the
original traversal order is wrong somehow, and that --topo-order fixes
it up after the fact.
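What the export stream needs is an order in which every commit comes after all of its parents. The Python sketch below shows one standard way to get such an order (Kahn's algorithm). It is not git's revision.c code, and `parents_first` is a hypothetical helper that assumes each parent is listed at most once per commit.

```python
from collections import deque


def parents_first(parents_of):
    """Return the commits so that every commit follows all of its parents.
    `parents_of` maps each commit name to the list of its parent names."""
    # Count, for each commit, how many of its parents are still unemitted.
    pending = {c: sum(1 for p in ps if p in parents_of)
               for c, ps in parents_of.items()}
    children = {c: [] for c in parents_of}
    for commit, parents in parents_of.items():
        for parent in parents:
            if parent in children:
                children[parent].append(commit)
    # Commits with no unemitted parents can go out immediately.
    ready = deque(c for c, n in pending.items() if n == 0)
    order = []
    while ready:
        commit = ready.popleft()
        order.append(commit)
        for child in children[commit]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    return order


history = {"C": ["B"], "B": ["A"], "A": [], "M": ["C", "A"]}
order = parents_first(history)
```

With an order like this, every parent already has a mark by the time its child is written, so no 'from' or 'merge' line has to be skipped.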
But for my immediate needs this is the solution I needed. Again, many
thanks.
BTW, for the record, this was needed because we are trying to merge
multiple git repos into a single new git repo, with each original repo
mapped into a subdirectory of the new repo, and with commit trees merged
in more or less the correct order (by date applied, more or less). IOW, we
don't want to have multiple "root commits" that are later merged.
We want a more or less linear repo as the result. This bug with
fast-export was the main showstopper in our efforts. However, I can
imagine that this is a problem that many people will want to solve. It
would be nice if there was an easier way to do it than what we currently
are doing (merging and munging multiple fast-export streams into a
single fast-import process). While at this point it's probably academic,
any suggestions as to the Best Way to do this would be very much
welcome.
Cheers and thanks to all you git developers for a great tool!
Yves
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
2008-06-12 12:16 ` Yves Orton
@ 2008-06-12 12:45 ` Johannes Sixt
2008-06-12 12:52 ` Michael J Gruber
1 sibling, 0 replies; 7+ messages in thread
From: Johannes Sixt @ 2008-06-12 12:45 UTC (permalink / raw)
To: Yves Orton; +Cc: git
Yves Orton schrieb:
> BTW, for the record this was needed because we are trying to merge
> multiple git repos into a single new git repo with each original repo
> mapped into a subdirectory of the new repo, and with commit trees merged
> in more or less the correct order (by date applied more or less). IOW we
> dont want to have multiple "root commits" that are later merged.
Try --date-order instead. It might work better for your task, and it still
offers a topologically correct order.
-- Hannes
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
2008-06-12 12:16 ` Yves Orton
2008-06-12 12:45 ` Johannes Sixt
@ 2008-06-12 12:52 ` Michael J Gruber
2008-06-12 14:02 ` Philippe Bruhat (BooK)
1 sibling, 1 reply; 7+ messages in thread
From: Michael J Gruber @ 2008-06-12 12:52 UTC (permalink / raw)
To: git
Yves Orton venit, vidit, dixit 12.06.2008 14:16:
> We want a more or less linear repo as the result. This bug with
> fast-export was the main showstopper in our efforts. However, I can
> imagine that this is a problem that many people will want to solve. It
> would be nice if there was an easier way to do it that what we currently
> are doing (merging and munging multiple fast-export streams into a
> single fast-import process). While at this point its probably academic
> any suggestions as to the Best Way to do this would be very much
> welcome.
I've done something like this, "stitching" the history of different
repos together in order to produce one repo, with each of the
constituents in a subdir. What I did was an adaptation of
http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html
but as a multistep version:
1. Create an empty repo
2. Add your to-be-stitched repos as remotes, say A B C
3. Create an empty commit
4. "git merge -s ours --no-commit a b c", where a b c are the root
commits of A B C
5. "git read-tree --prefix=dir-A/ -u a" and analogously for b c
6. "git commit", use the common commit message of those commits
Note that git refuses the merge (4.) into an empty (headless) repo,
which is why you need 3. There may be smarter ways.
If you don't care about recording the commits as (octopus) merges you
can skip 3. and 4. (4. just records merge info in the index).
Then, repeat:
3'. remove dir-A etc. (I think I used git-rm, I'm sorry I can't recall).
4. as above (if you want to record as merge)
5. as above
6. as above
If not all of A B C appear in every step then make sure to remove only
the ones (in 3'.) which you'll update in 5. You have to remove the dir
because read-tree wants it like that.
I used this for stitching 5 or 6 repos with a short history together, so
I repeated these steps manually rather than scripting it; all I needed
was a list of SHA1s listing which commits from A B C etc. corresponded
to the same "step" in the combined repo.
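Since the procedure above was repeated by hand, here is a rough Python sketch of how one "step" could be turned into a list of git command lines. Nothing is executed; `stitch_step_commands` and its parameters are hypothetical names. Using `git commit --allow-empty` for the empty commit and `git rm -r` for step 3' are assumptions, since the message does not say exactly how those were done. Newer Git versions may also want `--allow-unrelated-histories` for the merge, which is not part of the original recipe.

```python
def stitch_step_commands(commits, message, first_step=False, record_merge=True):
    """Build the git commands for one stitching step.
    `commits` maps a subdirectory name to the commit imported into it."""
    commands = []
    if first_step:
        if record_merge:
            # Step 3. ASSUMPTION: the empty commit is made with --allow-empty.
            commands.append(["git", "commit", "--allow-empty", "-m", "empty root"])
    else:
        # Step 3'. ASSUMPTION: directories are removed with 'git rm -r'.
        for subdir in commits:
            commands.append(["git", "rm", "-r", subdir])
    if record_merge:
        # Step 4: only records the merge parents, the tree comes from step 5.
        commands.append(["git", "merge", "-s", "ours", "--no-commit",
                         *commits.values()])
    for subdir, commit in commits.items():
        # Step 5: read each commit's tree into its own subdirectory.
        commands.append(["git", "read-tree", "--prefix=%s/" % subdir, "-u", commit])
    # Step 6: commit the combined tree.
    commands.append(["git", "commit", "-m", message])
    return commands


first = stitch_step_commands({"dir-A": "a", "dir-B": "b"}, "initial import",
                             first_step=True)
later = stitch_step_commands({"dir-A": "a2"}, "second step")
```

Each returned entry is an argument list that could be passed to `subprocess.run` inside the new repository, once per row of the SHA1 list mentioned above.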
Cheers
Michael
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
2008-06-12 12:52 ` Michael J Gruber
@ 2008-06-12 14:02 ` Philippe Bruhat (BooK)
0 siblings, 0 replies; 7+ messages in thread
From: Philippe Bruhat (BooK) @ 2008-06-12 14:02 UTC (permalink / raw)
To: git
On Thu, Jun 12, 2008 at 02:52:40PM +0200, Michael J Gruber wrote:
> Yves Orton venit, vidit, dixit 12.06.2008 14:16:
>> We want a more or less linear repo as the result. This bug with
>> fast-export was the main showstopper in our efforts. However, I can
>> imagine that this is a problem that many people will want to solve. It
>> would be nice if there was an easier way to do it that what we currently
>> are doing (merging and munging multiple fast-export streams into a
>> single fast-import process). While at this point its probably academic
>> any suggestions as to the Best Way to do this would be very much
>> welcome.
>
> I've done something like this, "stitching" the history of different
> repos together in order to produce one repo, with each of the
> constituents in a subdir. What I did was an adaption of
>
> http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html
>
> but as a multistep version:
What we did with Yves was a script doing the following:
- run git fast-export --all (and --topo-order now) on all the repositories
we wanted to merge and read blocks from them
- pass through all non-commit blocks (munging paths to put the content of
each repo in its own directory and renumbering marks to avoid clashes)
- keep a list of the next commit sent by fast-export for each repo
- select the oldest commit, and send it through, after stitching in the
right place (the point being to determine the "right place")
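The "keep the next commit of each repo, select the oldest" part of the list above can be sketched in Python as follows. This is an illustration, not the actual script: `merge_oldest_first` is a hypothetical name, commits are reduced to (timestamp, name) pairs, and ties are broken by repo name as an arbitrary choice.

```python
def merge_oldest_first(streams):
    """Interleave several commit streams, always taking the oldest of the
    commits currently at the head of each stream.
    `streams` maps a repo name to its commits as (timestamp, name) pairs,
    in the order that repo's export produced them."""
    iterators = {repo: iter(commits) for repo, commits in streams.items()}
    heads = {}   # repo -> next commit not yet sent through
    for repo, iterator in iterators.items():
        first = next(iterator, None)
        if first is not None:
            heads[repo] = first
    merged = []
    while heads:
        # Pick the repo whose pending commit is the oldest.
        repo = min(heads, key=lambda r: (heads[r][0], r))
        merged.append((repo, heads[repo][1]))
        following = next(iterators[repo], None)
        if following is None:
            del heads[repo]
        else:
            heads[repo] = following
    return merged


merged = merge_oldest_first({
    "A": [(1, "A1"), (4, "A2")],
    "B": [(2, "B1"), (3, "B2")],
})
```

Only the head of each stream is ever considered, so the order within each repo is preserved even when its timestamps are not monotonic.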
Actually, what we are trying to do is produce a single DAG from 2 or
more DAGs, while making sure that each "internal DAG" stays the same.
(I'm pretty sure this is all trivial stuff for graph mathematicians.)
Imagine we merged repositories A, B and C into a new repo D: if we replace
all nodes in D coming from B and C by edges, we will end up with
the original A graph.
We defined the "right place" as follows: once we have selected the next
commit to add to our new graph, each of its new parents is defined as "the
last alien child of the original parent" (or the original parent itself).
For example, if our new repository being built looks like:
    --A7--A8--B4--B5--A9--B7
                \
                 --B6
In this case, A9 was originally attached to A8, but to avoid unnecessary
branching in the new repo, we didn't attach it to A8, but to B5 (last
alien child of A8, descending the tree in a leftmost manner).
No A node will ever be attached to B6. The next A node originally
attached to A8 will be attached to B5 again, and one originally attached
to A9 will be attached to B7. Like this:
                     --A10
                    /
    --A7--A8--B4--B5--A9--B7--A11
                \
                 --B6
Now, if we remove all B nodes, we get this:
             --A10
            /
    --A7--A8--A9--A11
which is the original A graph.
Finding the "last alien child" works fine with merges, too.
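The attachment rule can be modelled in a few lines of Python. This is a toy reading of the description above, not the script itself. The `Stitcher` class is hypothetical, and so is its handling of root commits, which attaches them to the most recently added commit. B6 is assumed to have been a child of B4 in the original B repository.

```python
class Stitcher:
    """Toy model of the 'last alien child' attachment rule."""

    def __init__(self):
        self.repo_of = {}    # commit -> repo it came from
        self.children = {}   # commit -> children in the new graph, oldest first
        self.parents = {}    # commit -> parents in the new graph
        self.latest = None   # most recently added commit

    def last_alien_child(self, commit):
        repo = self.repo_of[commit]
        node = commit
        # Follow only the first (leftmost) child, and only while it is alien,
        # i.e. comes from a different repo than the original parent.
        while self.children[node] and self.repo_of[self.children[node][0]] != repo:
            node = self.children[node][0]
        return node

    def add(self, commit, repo, original_parents):
        new_parents = []
        for parent in original_parents:
            target = self.last_alien_child(parent)
            if target not in new_parents:
                new_parents.append(target)
        # ASSUMPTION (not from the thread): a root commit is attached to the
        # most recently added commit, to avoid a second root.
        if not original_parents and self.latest is not None:
            new_parents.append(self.latest)
        self.repo_of[commit] = repo
        self.children[commit] = []
        self.parents[commit] = new_parents
        for parent in new_parents:
            self.children[parent].append(commit)
        self.latest = commit


def build_example():
    stitcher = Stitcher()
    for commit, parents in [("A7", []), ("A8", ["A7"]), ("B4", []),
                            ("B5", ["B4"]), ("B6", ["B4"]), ("A9", ["A8"]),
                            ("B7", ["B5"]), ("A10", ["A8"]), ("A11", ["A9"])]:
        stitcher.add(commit, commit[0], parents)
    return stitcher


example = build_example()
```

Under these assumptions `example.parents` gives B5 for A9 and A10, A9 for B7, and B7 for A11, matching the diagrams; B6 stays on B4 and never receives an A commit.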
Of course, some commits from A might end up on an unrelated branch of B,
but all B branches are irrelevant to A anyway! :-)
--
Philippe Bruhat (BooK)
People are all unique- but some are more unique than others.
(Moral from Groo The Wanderer #22 (Epic))