* Export from bzr / Import to git results in a deleted file re-appearing @ 2012-07-12 18:00 Felix Natter 2012-07-12 21:01 ` Jeff King 0 siblings, 1 reply; 9+ messages in thread From: Felix Natter @ 2012-07-12 18:00 UTC (permalink / raw) To: git hi, I am trying to move freeplane's repository (GPL-project) from bzr to git, but when I do this: $ mkdir freeplane-git1 $ cd freeplane-git1 $ git init . $ bzr fast-export --export-marks=../marks.bzr ../trunk/ | git fast-import --export-marks=../marks.git $ git checkout then there are no errors, but the resulting working index is broken: freeplane-git1/freeplane_plugin_formula/src/org/freeplane/plugin/formula contains SpreadSheetUtils.java which belongs to package 'org.freeplane.plugin.spreadsheet' and which is no longer in the bzr trunk that I imported! The freeplane repo is publically available, so you should easily be able to reproduce this: bzr branch bzr://freeplane.bzr.sourceforge.net/bzrroot/freeplane/freeplane_program/trunk I reported this as a possible bzr-fastimport bug, but didn't get a response so far: https://bugs.launchpad.net/bzr-fastimport/+bug/1014291 Is there maybe a better way to convert a repo from bzr to git? This seems to be the standard solution on the web. I am using Bazaar (bzr) 2.5.0dev6, bzr-fastimport 0.13.0 and git 1.7.10.4. Thanks and Best Regards! -- Felix Natter ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Export from bzr / Import to git results in a deleted file re-appearing 2012-07-12 18:00 Export from bzr / Import to git results in a deleted file re-appearing Felix Natter @ 2012-07-12 21:01 ` Jeff King 2012-07-13 9:04 ` Andreas Schwab 0 siblings, 1 reply; 9+ messages in thread From: Jeff King @ 2012-07-12 21:01 UTC (permalink / raw) To: Felix Natter; +Cc: git On Thu, Jul 12, 2012 at 08:00:17PM +0200, Felix Natter wrote: > I am trying to move freeplane's repository (GPL-project) from bzr to > git, but when I do this: > > $ mkdir freeplane-git1 > $ cd freeplane-git1 > $ git init . > $ bzr fast-export --export-marks=../marks.bzr ../trunk/ | git fast-import --export-marks=../marks.git > $ git checkout > > then there are no errors, but the resulting working index is broken: > freeplane-git1/freeplane_plugin_formula/src/org/freeplane/plugin/formula > contains SpreadSheetUtils.java which belongs to package > 'org.freeplane.plugin.spreadsheet' and which is no longer in the bzr > trunk that I imported! If you run only the bzr half of your command and inspect the output, you will see that the file in question is mentioned twice. Once in a commit on "refs/heads/master" that renames into it from another file: R freeplane_plugin_spreadsheet/src/org/freeplane/plugin/spreadsheet/SpreadSheetUtils.java freeplane_plugin_formula/src/org/freeplane/plugin/formula/SpreadSheetUtils.java and another similar case creating a merge commit. But it is never deleted. So from a cursory inspection, it looks like git is right to include the file in the final output (but I didn't trace the complete history graph, so it's possible that those commits are somehow not used in the final result). Which leads me to believe that the bug is in bzr. -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Export from bzr / Import to git results in a deleted file re-appearing 2012-07-12 21:01 ` Jeff King @ 2012-07-13 9:04 ` Andreas Schwab 2012-07-13 13:02 ` Jeff King 0 siblings, 1 reply; 9+ messages in thread From: Andreas Schwab @ 2012-07-13 9:04 UTC (permalink / raw) To: Jeff King; +Cc: Felix Natter, git Jeff King <peff@peff.net> writes: > On Thu, Jul 12, 2012 at 08:00:17PM +0200, Felix Natter wrote: > >> I am trying to move freeplane's repository (GPL-project) from bzr to >> git, but when I do this: >> >> $ mkdir freeplane-git1 >> $ cd freeplane-git1 >> $ git init . >> $ bzr fast-export --export-marks=../marks.bzr ../trunk/ | git fast-import --export-marks=../marks.git >> $ git checkout >> >> then there are no errors, but the resulting working index is broken: >> freeplane-git1/freeplane_plugin_formula/src/org/freeplane/plugin/formula >> contains SpreadSheetUtils.java which belongs to package >> 'org.freeplane.plugin.spreadsheet' and which is no longer in the bzr >> trunk that I imported! > > If you run only the bzr half of your command and inspect the output, you > will see that the file in question is mentioned twice. Once in a commit > on "refs/heads/master" that renames into it from another file: > > R freeplane_plugin_spreadsheet/src/org/freeplane/plugin/spreadsheet/SpreadSheetUtils.java > freeplane_plugin_formula/src/org/freeplane/plugin/formula/SpreadSheetUtils.java That same revision also removes it, but is uses the original name for the deletion (the bzr revision actually renames the containing directory). That's probably what confuses git fast-import. Here is a test case: bzr init mkdir a bzr add a touch a/b bzr add a/b bzr ci -m a bzr mv a b bzr rm b/b bzr ci -m b bzr fast-export . The output contains these lines: R a/b b/b D a/b Changing the second line to D b/b fixes the bug. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Export from bzr / Import to git results in a deleted file re-appearing 2012-07-13 9:04 ` Andreas Schwab @ 2012-07-13 13:02 ` Jeff King 2012-07-13 13:39 ` Andreas Schwab 0 siblings, 1 reply; 9+ messages in thread From: Jeff King @ 2012-07-13 13:02 UTC (permalink / raw) To: Andreas Schwab; +Cc: Felix Natter, git On Fri, Jul 13, 2012 at 11:04:21AM +0200, Andreas Schwab wrote: > > If you run only the bzr half of your command and inspect the output, you > > will see that the file in question is mentioned twice. Once in a commit > > on "refs/heads/master" that renames into it from another file: > > > > R freeplane_plugin_spreadsheet/src/org/freeplane/plugin/spreadsheet/SpreadSheetUtils.java > > freeplane_plugin_formula/src/org/freeplane/plugin/formula/SpreadSheetUtils.java > > That same revision also removes it, but is uses the original name for > the deletion (the bzr revision actually renames the containing > directory). That's probably what confuses git fast-import. > [...] > The output contains these lines: > > R a/b b/b > D a/b > > Changing the second line to D b/b fixes the bug. Yeah, I agree that is problematic. But I do not think it is a fast-import bug, but rather bogus output generated by bzr fast-export (I am not clear from what you wrote above if you are considering it a bug that fast-import is confused). It seems nonsensical to mention a file both as a rename source and as deleted in the same revision, and certainly I would not expect an importer to deduce a link between the second line and b/b. -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Export from bzr / Import to git results in a deleted file re-appearing 2012-07-13 13:02 ` Jeff King @ 2012-07-13 13:39 ` Andreas Schwab 2012-07-14 14:33 ` Felix Natter 2012-07-15 10:23 ` [PATCH] fast-import: catch deletion of non-existent file in input Jeff King 0 siblings, 2 replies; 9+ messages in thread From: Andreas Schwab @ 2012-07-13 13:39 UTC (permalink / raw) To: Jeff King; +Cc: Felix Natter, git Jeff King <peff@peff.net> writes: > On Fri, Jul 13, 2012 at 11:04:21AM +0200, Andreas Schwab wrote: > >> > If you run only the bzr half of your command and inspect the output, you >> > will see that the file in question is mentioned twice. Once in a commit >> > on "refs/heads/master" that renames into it from another file: >> > >> > R freeplane_plugin_spreadsheet/src/org/freeplane/plugin/spreadsheet/SpreadSheetUtils.java >> > freeplane_plugin_formula/src/org/freeplane/plugin/formula/SpreadSheetUtils.java >> >> That same revision also removes it, but is uses the original name for >> the deletion (the bzr revision actually renames the containing >> directory). That's probably what confuses git fast-import. >> [...] >> The output contains these lines: >> >> R a/b b/b >> D a/b >> >> Changing the second line to D b/b fixes the bug. > > Yeah, I agree that is problematic. But I do not think it is a > fast-import bug, but rather bogus output generated by bzr fast-export (I > am not clear from what you wrote above if you are considering it a bug > that fast-import is confused). It seems nonsensical to mention a file > both as a rename source and as deleted in the same revision, and > certainly I would not expect an importer to deduce a link between the > second line and b/b. IMHO fast-import should raise an error in this case, like it does when you switch the lines. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Export from bzr / Import to git results in a deleted file re-appearing 2012-07-13 13:39 ` Andreas Schwab @ 2012-07-14 14:33 ` Felix Natter 2012-07-15 10:23 ` [PATCH] fast-import: catch deletion of non-existent file in input Jeff King 1 sibling, 0 replies; 9+ messages in thread From: Felix Natter @ 2012-07-14 14:33 UTC (permalink / raw) To: git Andreas Schwab <schwab@linux-m68k.org> writes: > Jeff King <peff@peff.net> writes: > >> On Fri, Jul 13, 2012 at 11:04:21AM +0200, Andreas Schwab wrote: >> >>> > If you run only the bzr half of your command and inspect the output, you >>> > will see that the file in question is mentioned twice. Once in a commit >>> > on "refs/heads/master" that renames into it from another file: >>> > >>> > R freeplane_plugin_spreadsheet/src/org/freeplane/plugin/spreadsheet/SpreadSheetUtils.java >>> > freeplane_plugin_formula/src/org/freeplane/plugin/formula/SpreadSheetUtils.java >>> >>> That same revision also removes it, but is uses the original name for >>> the deletion (the bzr revision actually renames the containing >>> directory). That's probably what confuses git fast-import. >>> [...] >>> The output contains these lines: >>> >>> R a/b b/b >>> D a/b >>> >>> Changing the second line to D b/b fixes the bug. >> >> Yeah, I agree that is problematic. But I do not think it is a >> fast-import bug, but rather bogus output generated by bzr fast-export (I >> am not clear from what you wrote above if you are considering it a bug >> that fast-import is confused). It seems nonsensical to mention a file >> both as a rename source and as deleted in the same revision, and >> certainly I would not expect an importer to deduce a link between the >> second line and b/b. > > IMHO fast-import should raise an error in this case, like it does when > you switch the lines. +1. Thanks very much for fixing bzr-fastexport! Also many thanks to Jeff for the analysis! Best Regards, -- Felix Natter ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] fast-import: catch deletion of non-existent file in input 2012-07-13 13:39 ` Andreas Schwab 2012-07-14 14:33 ` Felix Natter @ 2012-07-15 10:23 ` Jeff King 2012-07-15 18:11 ` Jonathan Nieder 1 sibling, 1 reply; 9+ messages in thread From: Jeff King @ 2012-07-15 10:23 UTC (permalink / raw) To: Andreas Schwab; +Cc: Shawn O. Pearce, David Barr, Felix Natter, git On Fri, Jul 13, 2012 at 03:39:50PM +0200, Andreas Schwab wrote: > >> The output contains these lines: > >> > >> R a/b b/b > >> D a/b > >> > >> Changing the second line to D b/b fixes the bug. > > > > Yeah, I agree that is problematic. But I do not think it is a > > fast-import bug, but rather bogus output generated by bzr fast-export (I > > am not clear from what you wrote above if you are considering it a bug > > that fast-import is confused). It seems nonsensical to mention a file > > both as a rename source and as deleted in the same revision, and > > certainly I would not expect an importer to deduce a link between the > > second line and b/b. > > IMHO fast-import should raise an error in this case, like it does when > you switch the lines. Here's a patch to do so. It means that fast-import is a little more strict about deletions than it used to be. I think it's probably a good idea for fast-import to err on the side of strictness (since erring on the other side may mean corrupt input). I don't know if the omission of error checking in the original was a philosophical choice or just random. :) -- >8 -- Subject: fast-import: catch deletion of non-existent file in input Fast-import does not do a lot of input verification; in particular, you can mark a path as deleted by a commit even if that files does not exist in the parent at all. On the one hand, it is nice for fast-import to be liberal in what it accepts, as it means we are more forgiving of exporter bugs or of people slicing and dicing the input. On the other hand, this can mask bugs in exporters, leading to a subtly wrong import result. In this case, bzr's fast-export generated a sequence like: R foo bar D foo when it meant to say: R foo bar D bar We silently ignored the bogus "D foo" directive, and the resulting tree incorrectly contained "bar". With this patch, we notice the bogus input and die. This patch adds a new test script specifically for checking bogus inputs to fast-import. We also need to tweak t9300, which appeared to be accidentally deleting a non-existent path in an otherwise unrelated test. Signed-off-by: Jeff King <peff@peff.net> --- The tweak in t9300 looks like the "dst2" entry was just leftover cruft from writing the test; there is no dst2 mentioned anywhere else. It comes from David's 8dc6a37; pleae correct me if my assumptions isn't correct. fast-import.c | 5 ++- t/t9300-fast-import.sh | 1 - t/t9302-fast-import-bogus.sh | 77 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 81 insertions(+), 2 deletions(-) create mode 100755 t/t9302-fast-import-bogus.sh diff --git a/fast-import.c b/fast-import.c index eed97c8..779adfc 100644 --- a/fast-import.c +++ b/fast-import.c @@ -2367,6 +2367,7 @@ static void file_change_d(struct branch *b) const char *p = command_buf.buf + 2; static struct strbuf uq = STRBUF_INIT; const char *endp; + struct tree_entry leaf; strbuf_reset(&uq); if (!unquote_c_style(&uq, p, &endp)) { @@ -2374,7 +2375,9 @@ static void file_change_d(struct branch *b) die("Garbage after path in: %s", command_buf.buf); p = uq.buf; } - tree_content_remove(&b->branch_tree, p, NULL); + tree_content_remove(&b->branch_tree, p, &leaf); + if (!leaf.versions[1].mode) + die("Path %s not in branch", p); } static void file_change_cr(struct branch *b, int rename) diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 2fcf269..7624ba5 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -1203,7 +1203,6 @@ test_expect_success PIPE 'N: empty directory reads as missing' ' printf "%s\n" "$line" >response && cat <<-\EOF D dst1 - D dst2 EOF ) | git fast-import --cat-blob-fd=3 3>backflow && diff --git a/t/t9302-fast-import-bogus.sh b/t/t9302-fast-import-bogus.sh new file mode 100755 index 0000000..95c5196 --- /dev/null +++ b/t/t9302-fast-import-bogus.sh @@ -0,0 +1,77 @@ +#!/bin/sh + +test_description='test that fast-import handles bogus input correctly' +. ./test-lib.sh + +# A few shorthands to make writing sample input easier +counter=0 +mark() { + counter=$(( $counter + 1)) && + echo "mark :$counter" +} + +blob() { + echo blob && + mark && + cat <<-\EOF + data 4 + foo + + EOF +} + +ident() { + test_tick && + echo "author $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL> $GIT_AUTHOR_DATE" + echo "committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE" +} + +commit() { + echo "commit refs/heads/master" && + mark && + ident && + cat <<-\EOF + data 8 + message + EOF +} + +root() { + blob && + m=$counter && + echo "reset refs/heads/master" && + commit && + echo "M 100644 :$m file" && + echo +} + +test_expect_success 'deleting a nonexistent file is an error' ' + { + root && + commit && + echo "D does-not-exist" + } >input && + test_must_fail git fast-import --force <input +' + +test_expect_success 'renaming a deleted file is an error' ' + { + root && + commit && + echo "D file" && + echo "R file dest" + } >input && + test_must_fail git fast-import --force <input +' + +test_expect_success 'deleting a renamed file is an error' ' + { + root && + commit && + echo "R file dest" && + echo "D file" + } >input && + test_must_fail git fast-import --force <input +' + +test_done -- 1.7.10.5.40.gbbc17de ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] fast-import: catch deletion of non-existent file in input 2012-07-15 10:23 ` [PATCH] fast-import: catch deletion of non-existent file in input Jeff King @ 2012-07-15 18:11 ` Jonathan Nieder 2012-07-16 0:26 ` Jeff King 0 siblings, 1 reply; 9+ messages in thread From: Jonathan Nieder @ 2012-07-15 18:11 UTC (permalink / raw) To: Jeff King; +Cc: Andreas Schwab, Shawn O. Pearce, David Barr, Felix Natter, git Hi, Jeff King wrote: > Subject: fast-import: catch deletion of non-existent file in input [...] > We silently ignored the bogus "D foo" directive, and the > resulting tree incorrectly contained "bar". With this patch, > we notice the bogus input and die. This breaks svn-fe, which relies on the existing semantics when asked to copy an empty directory. That's my fault because we never check that in the testsuite, but I also wouldn't be surprised if other importers were relying on the same thing. Any API break this big without a justification along the lines We can be confident that no existing importer uses this construct because ... _needs_ to be guarded by a new "feature" to be safe for existing importers. Let's repeat that for emphasis: API breaks in fast-import not guarded with a new "feature" type are not ok. Sorry, Jonathan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fast-import: catch deletion of non-existent file in input 2012-07-15 18:11 ` Jonathan Nieder @ 2012-07-16 0:26 ` Jeff King 0 siblings, 0 replies; 9+ messages in thread From: Jeff King @ 2012-07-16 0:26 UTC (permalink / raw) To: Jonathan Nieder Cc: Andreas Schwab, Shawn O. Pearce, David Barr, Felix Natter, git On Sun, Jul 15, 2012 at 01:11:51PM -0500, Jonathan Nieder wrote: > > Subject: fast-import: catch deletion of non-existent file in input > [...] > > We silently ignored the bogus "D foo" directive, and the > > resulting tree incorrectly contained "bar". With this patch, > > we notice the bogus input and die. > > This breaks svn-fe, which relies on the existing semantics when asked > to copy an empty directory. Thanks for the report. I had a worry while writing this that somebody was relying on the behavior. Let's just drop it, then. It's nice to catch errors in exporters, but not at the expense of compatibility issues. We could introduce a new feature bit, but I'm not sure it is really worthwhile. The older versions of bzr-fast-export would not set the bit anyway, and newer versions are already fixed, so it is kind of closing the barn door after the horse has left (we might catch other bugs, but this one is kind of oddly specific; if somebody wanted to audit fast-import for other similar cases and introduce a "strict" feature bit, that might be worthwhile. But for this single change, I don't think so). > Let's repeat that for emphasis: API breaks in fast-import not guarded > with a new "feature" type are not ok. Totally agree. The question in my mind was whether this was a bug fix or an API change, and it sounds like it is too far towards the latter. -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-07-16 0:27 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-07-12 18:00 Export from bzr / Import to git results in a deleted file re-appearing Felix Natter 2012-07-12 21:01 ` Jeff King 2012-07-13 9:04 ` Andreas Schwab 2012-07-13 13:02 ` Jeff King 2012-07-13 13:39 ` Andreas Schwab 2012-07-14 14:33 ` Felix Natter 2012-07-15 10:23 ` [PATCH] fast-import: catch deletion of non-existent file in input Jeff King 2012-07-15 18:11 ` Jonathan Nieder 2012-07-16 0:26 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).