* git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
From: Yves Orton @ 2008-06-12 10:21 UTC
To: git

Hi,

I've been working with git-fast-export a bit recently and I've hit a bug
that is causing some trouble.

Essentially it seems that for one of our repos git-fast-export fails to emit
the proper 'from' information for several commits in the repo. These
commits are emitted first without parent data even though their parents
ARE emitted later.

The code responsible for skipping the parent info is in
builtin-fast-export.c around line 402:

    for (i = 0, p = commit->parents; p; p = p->next) {
        int mark = get_object_mark(&p->item->object);
        if (!mark)
            continue;
        if (i == 0)
            printf("from :%d\n", mark);
        else
            printf("merge :%d\n", mark);
        i++;
    }

If I modify this loop to warn when skipping a parent, I get a warning for
each of the "broken" commits. Apparently, because they are emitted before
their parents, the parents have no "mark" assigned to them (via
decoration) and thus are skipped in this emit process. This would make
sense for emitting a limited number of patches, but makes no sense when
the --all option is used.

I've tried to investigate further but I got lost in a twisty maze of
routines in revision.c, which apparently is responsible for building a
list of items to emit in the correct order. However, I think it is notable
that both gitk and git log seem quite able to deal with things properly,
thus I find it a bit strange that fast-export would get it wrong.

Unfortunately I have no idea how to create a minimal repo that
illustrates this problem.

I'm currently on git version 1.5.6.rc2.29.g3ba9 (latest version from last
night); however, this problem shows itself on 1.5.4.3 as well, and on an
earlier version whose exact number I no longer know.

Other evidence that might be useful: git log --pretty=format:"%H:%P"
shows that every commit but one (the root) has parents. And gitk renders
the original repo fine. The repo can be cloned etc. without trouble.
The problem seems to be strictly related to fast-export.

I'm not on the list, so please cc me on any replies.

Thanks a lot!
Yves
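To make the failure mode described in this message concrete, here is a minimal, self-contained C sketch. It is not git's code: `toy_commit`, `emit_commit`, `export_in_order` and the `marks` table are hypothetical stand-ins for the real commit structure, the export loop and `get_object_mark()`. It only illustrates that a child emitted before its parent finds no mark for that parent, so its `from` line is silently dropped.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_COMMITS 16

/* Hypothetical stand-in for a commit: an id plus up to two parent ids (-1 = none). */
struct toy_commit {
	int id;
	int parents[2];
};

/* marks[id] == 0 means "not exported yet", mirroring the !mark test in the quoted loop. */
static int marks[MAX_COMMITS];
static int last_mark;

/*
 * Emit one commit in the style of the quoted loop: parents that have no
 * mark yet are skipped. Returns the number of from/merge lines written.
 */
int emit_commit(const struct toy_commit *c, FILE *out)
{
	int i = 0;

	assert(c->id >= 0 && c->id < MAX_COMMITS);
	marks[c->id] = ++last_mark;
	fprintf(out, "commit %d\nmark :%d\n", c->id, marks[c->id]);
	for (int k = 0; k < 2; k++) {
		int parent = c->parents[k];
		int mark = parent < 0 ? 0 : marks[parent];

		if (!mark)
			continue; /* parent not emitted yet: the link is lost */
		fprintf(out, "%s :%d\n", i == 0 ? "from" : "merge", mark);
		i++;
	}
	return i;
}

/* Emit commits[order[0..n-1]] in that order; return the total parent links written. */
int export_in_order(const struct toy_commit *commits, const int *order, int n, FILE *out)
{
	int links = 0;

	for (int k = 0; k < MAX_COMMITS; k++)
		marks[k] = 0;
	last_mark = 0;
	for (int k = 0; k < n; k++)
		links += emit_commit(&commits[order[k]], out);
	return links;
}
```

With a linear three-commit history, calling `export_in_order` with the parents-first order writes both parent links, while an order that emits the middle commit before the root writes only one: the middle commit becomes an "island" with no `from` line.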
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
From: Johannes Sixt @ 2008-06-12 11:53 UTC
To: Yves Orton; +Cc: git

Yves Orton wrote:
> Hi,
>
> I've been working with git-fast-export a bit recently and I've hit a bug
> that is causing some trouble.
>
> Essentially it seems that for one of our repos git-fast-export fails to emit
> the proper 'from' information for several commits in the repo. These
> commits are emitted first without parent data even though their parents
> ARE emitted later.

Does it make a difference if you pass --topo-order to git fast-export?
(But I don't know for certain that this is even legal.)

-- Hannes
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
From: Yves Orton @ 2008-06-12 12:04 UTC
To: Johannes Sixt; +Cc: git

On Thu, 2008-06-12 at 13:53 +0200, Johannes Sixt wrote:
> Yves Orton wrote:
> > Hi,
> >
> > I've been working with git-fast-export a bit recently and I've hit a bug
> > that is causing some trouble.
> >
> > Essentially it seems that for one of our repos git-fast-export fails to emit
> > the proper 'from' information for several commits in the repo. These
> > commits are emitted first without parent data even though their parents
> > ARE emitted later.
>
> Does it make a difference if you pass --topo-order to git fast-export?
> (But I don't know for certain that this is even legal.)

Yes it does make a difference. A big difference. That would be the
workaround I really needed. At least currently that's the way it looks,
I haven't thoroughly tested the result yet but it certainly looks right.

Perhaps this should be enabled by default to avoid the problem I
encountered? At least until whatever the cause of the root problem is
identified and fixed.

Thanks a lot. ++ to you.

Cheers,
yves
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
From: Yves Orton @ 2008-06-12 12:16 UTC
To: Johannes Sixt; +Cc: git

On Thu, 2008-06-12 at 14:04 +0200, Yves Orton wrote:
> On Thu, 2008-06-12 at 13:53 +0200, Johannes Sixt wrote:
> > Yves Orton wrote:
> > > Hi,
> > >
> > > I've been working with git-fast-export a bit recently and I've hit a bug
> > > that is causing some trouble.
> > >
> > > Essentially it seems that for one of our repos git-fast-export fails to emit
> > > the proper 'from' information for several commits in the repo. These
> > > commits are emitted first without parent data even though their parents
> > > ARE emitted later.
> >
> > Does it make a difference if you pass --topo-order to git fast-export?
> > (But I don't know for certain that this is even legal.)
>
> Yes it does make a difference. A big difference. That would be the
> workaround I really needed. At least currently that's the way it looks,
> I haven't thoroughly tested the result yet but it certainly looks right.
>
> Perhaps this should be enabled by default to avoid the problem I
> encountered? At least until whatever the cause of the root problem is
> identified and fixed.
>
> Thanks a lot. ++ to you.

I should add that with this switch enabled the output order is correct,
HOWEVER the mark number of the first commit is unchanged from the
original. However, the parent relationships are correctly restored and the
resulting repo has the correct SHA1 stamps. So it looks like the
original traversal order is wrong somehow, and that --topo-order fixes
it up after the fact.

But for my immediate needs this is the solution I needed. Again, many
thanks.

BTW, for the record this was needed because we are trying to merge
multiple git repos into a single new git repo with each original repo
mapped into a subdirectory of the new repo, and with commit trees merged
in more or less the correct order (by date applied more or less). IOW we
don't want to have multiple "root commits" that are later merged.

We want a more or less linear repo as the result. This bug with
fast-export was the main showstopper in our efforts. However, I can
imagine that this is a problem that many people will want to solve. It
would be nice if there was an easier way to do it than what we currently
are doing (merging and munging multiple fast-export streams into a
single fast-import process). While at this point it's probably academic,
any suggestions as to the Best Way to do this would be very much
welcome.

Cheers and thanks to all you git developers for a great tool!
Yves
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
From: Johannes Sixt @ 2008-06-12 12:45 UTC
To: Yves Orton; +Cc: git

Yves Orton wrote:
> BTW, for the record this was needed because we are trying to merge
> multiple git repos into a single new git repo with each original repo
> mapped into a subdirectory of the new repo, and with commit trees merged
> in more or less the correct order (by date applied more or less). IOW we
> don't want to have multiple "root commits" that are later merged.

Try --date-order instead. It might work better for your task, and it
still offers a topologically correct order.

-- Hannes
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
From: Michael J Gruber @ 2008-06-12 12:52 UTC
To: git

Yves Orton came, saw, said 12.06.2008 14:16:
> We want a more or less linear repo as the result. This bug with
> fast-export was the main showstopper in our efforts. However, I can
> imagine that this is a problem that many people will want to solve. It
> would be nice if there was an easier way to do it than what we currently
> are doing (merging and munging multiple fast-export streams into a
> single fast-import process). While at this point it's probably academic,
> any suggestions as to the Best Way to do this would be very much
> welcome.

I've done something like this, "stitching" the history of different
repos together in order to produce one repo, with each of the
constituents in a subdir. What I did was an adaptation of

http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html

but as a multistep version:

1. Create an empty repo
2. Add your to-be-stitched repos as remotes, say A B C
3. Create an empty commit
4. "git merge -s ours --no-commit a b c", where a b c are the root
   commits of A B C
5. "git read-tree --prefix=dir-A/ -u a" and analogously for b c
6. "git commit", use the common commit message of those commits

Note that git refuses the merge (4.) into an empty (headless) repo,
which is why you need 3. There may be smarter ways.
If you don't care about recording the commits as (octopus) merges you
can skip 3. and 4. (4. just records merge info in the index).

Then, repeat:

3'. remove dir-A etc. (I think I used git-rm, I'm sorry I can't recall).
4. as above (if you want to record as merge)
5. as above
6. as above

If not all of A B C appear in every step then make sure to remove only
the ones (in 3'.) which you'll update in 5. You have to remove the dir
because read-tree wants it like that.

I used this for stitching 5 or 6 repos with a short history together, so
I repeated these steps manually rather than scripting it; all I needed
was a list of SHA1s listing which commits from A B C etc. corresponded
to the same "step" in the combined repo.

Cheers
Michael
* Re: git-fast-export bug, commits emitted in incorrect order causing parent data to be lost from commits turning essentially linear repo into "islands"
From: Philippe Bruhat (BooK) @ 2008-06-12 14:02 UTC
To: git

On Thu, Jun 12, 2008 at 02:52:40PM +0200, Michael J Gruber wrote:
> Yves Orton came, saw, said 12.06.2008 14:16:
>> We want a more or less linear repo as the result. This bug with
>> fast-export was the main showstopper in our efforts. However, I can
>> imagine that this is a problem that many people will want to solve. It
>> would be nice if there was an easier way to do it than what we currently
>> are doing (merging and munging multiple fast-export streams into a
>> single fast-import process). While at this point it's probably academic,
>> any suggestions as to the Best Way to do this would be very much
>> welcome.
>
> I've done something like this, "stitching" the history of different
> repos together in order to produce one repo, with each of the
> constituents in a subdir. What I did was an adaptation of
>
> http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html
>
> but as a multistep version:

What we did with Yves was a script doing the following:

- run git fast-export --all (and --topo-order now) on all the
  repositories we wanted to merge and read blocks from them
- pass through all non-commit blocks (munging paths to put the content
  of each repo in its own directory and renumbering marks to avoid
  clashes)
- keep a list of the next commit sent by fast-export for each repo
- select the oldest commit, and send it through, after stitching it in
  at the right place (the point being to determine the "right place")

Actually, what we are trying to do is produce a single DAG from 2 or
more DAGs, while making sure that each "internal DAG" stays the same.
(I'm pretty sure this is all trivial stuff for graph mathematicians.)

Imagine we merged repositories A, B and C into a new repo D. If we
replace all nodes in D coming from B and C by edges, we will end up
with the original A graph.

We defined the "right place" like so: when we have selected the next
commit to add to our new graph, each of its new parents is defined by
"the last alien child of the original parent" (or the original parent
itself).

For example, suppose the new repository being built looks like this:

    --A7--A8--B4--B5--A9--B7
                \
                 --B6

In this case, A9 was originally attached to A8, but to avoid unnecessary
branching in the new repo, we didn't attach it to A8, but to B5 (the last
alien child of A8, descending the tree in a leftmost manner). No A node
will ever be attached to B6.

The next A node originally attached to A8 will be attached to B5 again,
and one originally attached to A9 will be attached to B7. Like this:

                     --A10
                    /
    --A7--A8--B4--B5--A9--B7--A11
                \
                 --B6

Now, if we remove all B nodes, we get this:

             --A10
            /
    --A7--A8--A9--A11

which is the original A graph.

Finding the "last alien child" works fine with merges, too.

Of course, some commits from A might end up on an unrelated branch of B,
but all B branches are irrelevant to A anyway! :-)

-- 
Philippe Bruhat (BooK)

People are all unique- but some are more unique than others.
(Moral from Groo The Wanderer #22 (Epic))
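The "last alien child" rule described in this message can be sketched in C as follows. This is one reading of the rule, not the script the authors used: `struct node`, `struct graph`, `add_node` and `attach_point` are hypothetical names, each node records only its leftmost (first attached) child, and "alien" is taken to mean "from a different repository than the commit being placed".

```c
#include <assert.h>

#define MAX_NODES 64

/* One node of the merged graph being built. */
struct node {
	char repo;       /* source repository, e.g. 'A' or 'B' */
	int first_child; /* index of the leftmost (first attached) child, -1 if none */
};

struct graph {
	struct node nodes[MAX_NODES];
	int count;
};

/* Attach a new node from `repo` under `parent` (-1 for a root); return its index. */
int add_node(struct graph *g, char repo, int parent)
{
	int idx = g->count;

	assert(idx < MAX_NODES);
	assert(parent >= -1 && parent < idx);
	g->nodes[idx].repo = repo;
	g->nodes[idx].first_child = -1;
	if (parent >= 0 && g->nodes[parent].first_child < 0)
		g->nodes[parent].first_child = idx; /* later children are not leftmost */
	g->count++;
	return idx;
}

/*
 * "Last alien child": starting at the original parent, keep following the
 * leftmost child for as long as that child comes from another repo.
 * The walk terminates because a child always has a larger index than
 * its parent.
 */
int attach_point(const struct graph *g, int orig_parent, char repo)
{
	int cur = orig_parent;

	assert(cur >= 0 && cur < g->count);
	for (;;) {
		int child = g->nodes[cur].first_child;

		if (child < 0 || g->nodes[child].repo == repo)
			return cur;
		cur = child;
	}
}
```

Replaying the example with A7, A8, B4 and B5 added as a chain, `attach_point` for an A commit whose original parent is A8 returns B5. Once A9 has been attached there, a second A commit with original parent A8 still gets B5, because the leftmost child of B5 is now an A node and the walk stops.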