* [PATCH] Add a test for a problem in "rebase --whitespace=fix" @ 2010-02-07 8:10 Björn Gustavsson 2010-02-07 18:38 ` Junio C Hamano 0 siblings, 1 reply; 7+ messages in thread From: Björn Gustavsson @ 2010-02-07 8:10 UTC (permalink / raw) To: git; +Cc: Junio C Hamano The command "git rebase --whitespace=fix HEAD~<N>" is supposed to only clean up trailing whitespace, and the expectation is that it cannot fail. Unfortunately, if one commit adds a blank line at the end of a file and a subsequent commit adds more non-blank lines after the blank line, "git apply" (used indirectly by "git rebase") will fail to apply the patch of the second commit. Signed-off-by: Björn Gustavsson <bgustavsson@gmail.com> --- t/t3417-rebase-whitespace-fix.sh | 45 ++++++++++++++++++++++++++++++++++++++ 1 files changed, 45 insertions(+), 0 deletions(-) create mode 100755 t/t3417-rebase-whitespace-fix.sh diff --git a/t/t3417-rebase-whitespace-fix.sh b/t/t3417-rebase-whitespace-fix.sh new file mode 100755 index 0000000..55cbce7 --- /dev/null +++ b/t/t3417-rebase-whitespace-fix.sh @@ -0,0 +1,45 @@ +#!/bin/sh + +test_description='git rebase --whitespace=fix + +This test runs git rebase --whitespace=fix and make sure that it works. +' + +. ./test-lib.sh + +# prepare initial revision of "file" with a blank line at the end +cat >file <<EOF +a +b +c + +EOF + +# expected contents in "file" after rebase +cat >expect <<EOF +a +b +c +EOF + +# prepare second revision of "file" +cat >second <<EOF +a +b +c + +d +e +f +EOF + +test_expect_failure 'blanks line at end of file; extend at end of file' ' + git commit --allow-empty -m "Initial empty commit" && + git add file && git commit -m first && + mv second file && + git add file && git commit -m second && + git rebase --whitespace=fix HEAD^^ && + git diff --exit-code HEAD^:file expect +' + +test_done -- 1.7.0.rc1.46.g04bf4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix" 2010-02-07 8:10 [PATCH] Add a test for a problem in "rebase --whitespace=fix" Björn Gustavsson @ 2010-02-07 18:38 ` Junio C Hamano 2010-02-07 22:44 ` Björn Gustavsson 0 siblings, 1 reply; 7+ messages in thread From: Junio C Hamano @ 2010-02-07 18:38 UTC (permalink / raw) To: Björn Gustavsson; +Cc: git Björn Gustavsson <bgustavsson@gmail.com> writes: > The command "git rebase --whitespace=fix HEAD~<N>" is supposed to > only clean up trailing whitespace, and the expectation is that it > cannot fail. I don't know if the expectation is sound to begin with, for exactly the reason you mention below. > Unfortunately, if one commit adds a blank line at the end of a file > and a subsequent commit adds more non-blank lines after the blank > line,... First, is this a condition that we want to change the behaviour to "succeed" later? Imagine that the gap between abc and def block in your example is much larger to exceed the number of pre-context lines of your second patch (usually 3), and imagine you are the "git apply --whitespace=fix" program you have updated to "fix" the preceived problem. You know you earlier might have stripped some blank lines at the EOF, but there is nothing that tells you if you had only 3 blank lines, or you had even more. How many blank lines will you be adding back? I think one fundamental difference between stripping of blanks at EOL and blanks at EOF is that the former, even after applying an earlier patch with the whitespace fix, could be fudged to an unspecified number of whitespaces while you are applying the second one, exactly because you will strip them out from the output of the second one anyway. But the latter will have to _appear_ in the result, as you have demonstrated, as a gap between abc and def blocks in your example. Simply there is not enough information to do so. Around 1.6.6/1.6.5.3 timeframe, we have separated blank-at-{eol,eof} out of trailing-space to allow users to keep the traling blank lines. Perhaps you could demonstrate what is expected to work (and not bothering with what is not ever expected to work) by changing the test like this. I added one "trailing whitespace at EOL" example to keep the "strip trailing whitespace" part working, by the way. t/t3417-rebase-whitespace-fix.sh | 11 +++++++---- 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/t/t3417-rebase-whitespace-fix.sh b/t/t3417-rebase-whitespace-fix.sh index 55cbce7..0e366f8 100755 --- a/t/t3417-rebase-whitespace-fix.sh +++ b/t/t3417-rebase-whitespace-fix.sh @@ -7,9 +7,9 @@ This test runs git rebase --whitespace=fix and make sure that it works. . ./test-lib.sh -# prepare initial revision of "file" with a blank line at the end -cat >file <<EOF -a +# prepare initial revision of "file" with a trailing blank and a blank line at the end +sed -e 's/Z//' >file <<\EOF +a Z b c @@ -20,6 +20,7 @@ cat >expect <<EOF a b c + EOF # prepare second revision of "file" @@ -33,7 +34,9 @@ e f EOF -test_expect_failure 'blanks line at end of file; extend at end of file' ' +test_expect_success 'keep blanks line at end of file' ' + git config core.whitespace -blank-at-eof && + git commit --allow-empty -m "Initial empty commit" && git add file && git commit -m first && mv second file && ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix" 2010-02-07 18:38 ` Junio C Hamano @ 2010-02-07 22:44 ` Björn Gustavsson 2010-02-08 0:15 ` Junio C Hamano 0 siblings, 1 reply; 7+ messages in thread From: Björn Gustavsson @ 2010-02-07 22:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: git 2010/2/7 Junio C Hamano <gitster@pobox.com>: > First, is this a condition that we want to change the behaviour to > "succeed" later? Yes, assuming it is possible to fix. > Imagine that the gap between abc and def block in your example is much > larger to exceed the number of pre-context lines of your second patch > (usually 3), and imagine you are the "git apply --whitespace=fix" program > you have updated to "fix" the preceived problem. You know you earlier > might have stripped some blank lines at the EOF, but there is nothing that > tells you if you had only 3 blank lines, or you had even more. How many > blank lines will you be adding back? My original idea was to add back exactly the number of lines needed so that the context lines would match. That can be calculated from the line numbers of the last line of the pre-image and the line number in the chunk and by scanning the chunk for blank context lines (both at the beginning and end of chunk). Since the blanks lines at the end will be stripped away anyway, I doesn't matter if I add back fewer lines than were there originally. Thinking a little more about it, if there is a chunk that starts beyond the end of file, I could add the number of blanks lines that is missing up to the beginning of the chunk plus the number of lines in the chunk itself. That will also take care of the case that the chunk deletes blanks lines. (That will probably add too many blanks at the end, but they will be stripped out again.) That is my plan. Of course, since I have not attempted to implement it yet, there could be fatal flaws in it. Do you see any fatal flaws that I don't see? -- Björn Gustavsson, Erlang/OTP, Ericsson AB ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix" 2010-02-07 22:44 ` Björn Gustavsson @ 2010-02-08 0:15 ` Junio C Hamano 2010-02-08 7:37 ` Björn Gustavsson 0 siblings, 1 reply; 7+ messages in thread From: Junio C Hamano @ 2010-02-08 0:15 UTC (permalink / raw) To: Björn Gustavsson; +Cc: git Björn Gustavsson <bgustavsson@gmail.com> writes: > 2010/2/7 Junio C Hamano <gitster@pobox.com>: > >> First, is this a condition that we want to change the behaviour to >> "succeed" later? > > Yes, assuming it is possible to fix. > >> Imagine that the gap between abc and def block in your example is much >> larger to exceed the number of pre-context lines of your second patch >> (usually 3), and imagine you are the "git apply --whitespace=fix" program >> you have updated to "fix" the preceived problem. You know you earlier >> might have stripped some blank lines at the EOF, but there is nothing that >> tells you if you had only 3 blank lines, or you had even more. How many >> blank lines will you be adding back? > > My original idea was to add back exactly the number of lines needed > so that the context lines would match. That can be calculated from > the line numbers of the last line of the pre-image and the line number > in the chunk and by scanning the chunk for blank context lines > (both at the beginning and end of chunk). Since the blanks lines > at the end will be stripped away anyway, I doesn't matter if I add > back fewer lines than were there originally. And if it were in the middle like your patch had? Suppose the first patch in your example ended with 10 blank lines instead of just one. You apply it with --ws=fix and end up with 3-liner file with a/b/c. The sender of that patch then builds on top of his copy (still with 10 blanks at EOF). Perhaps the early part a/b/c might be changed to a/b/c/1/2 or something. And on top of that change, he adds new text at the end of the file (after those 10 blank lines) with another patch, like your example added d/e/f at the end after the gap. The sender then chooses to cherry pick and gives you only the last patch, to add d/e/f, for whatever reason. The change in the earlier part to add 1/2 after a/b/c was not suitable for public consumption yet, perhaps. The patch comes with 3 pre-context lines as usual, what you see begins with three blank lines, and has an addition of d/e/f. You have already stripped the blank lines at the end when you applied the original one; you do not know how many blank lines at the end you lost when you applied it (and you do not _want_ to record that when applying either). You cannot go by the line numbers on the "@@ -l,k +m,n @@" header line you see in the second patch you received. On that line, only k and n are reliable numbers (the must match the patch text). l and m are unreliable; being able to apply even if the text you have at hand does not exactly match l and m is the whole point of transmitting the change in the patch format. The _only_ information you have usable at that point is that there are _at least_ 3 blank lines before the addition, and perhaps the fact that the hunk ends without post context lines. The latter tells you that it must apply at the end, but still doesn't tell you how many blanks you need to add back at EOF before applying the patch. > Do you see any fatal flaws that I don't see? I don't know---the above is what I already said in my previous message and it already looked fatal enough to me. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix" 2010-02-08 0:15 ` Junio C Hamano @ 2010-02-08 7:37 ` Björn Gustavsson 2010-02-09 21:58 ` Junio C Hamano 0 siblings, 1 reply; 7+ messages in thread From: Björn Gustavsson @ 2010-02-08 7:37 UTC (permalink / raw) To: Junio C Hamano; +Cc: git 2010/2/8 Junio C Hamano <gitster@pobox.com>: > You cannot go by the line numbers on the "@@ -l,k +m,n @@" header line you > see in the second patch you received. On that line, only k and n are > reliable numbers (the must match the patch text). l and m are unreliable; > being able to apply even if the text you have at hand does not exactly > match l and m is the whole point of transmitting the change in the patch > format. The _only_ information you have usable at that point is that > there are _at least_ 3 blank lines before the addition, and perhaps the > fact that the hunk ends without post context lines. The latter tells you > that it must apply at the end, but still doesn't tell you how many blanks > you need to add back at EOF before applying the patch. I agree. The information is not enough if you apply one patch at the time. But my usage case that my test tries to demonstrate is different: I already have a number of commits in my repository (received either by pulling or applying a whole series of patches at once). I then do, for example: git rebase --whitespace=fix HEAD~4 to clean up the existing commits. That rebase uses "git apply" internally seems like an implementation detail that I as a user of rebase don't care about. I just expect it to work. I see at least two possible ways to implement that: 1. Have "git rebase" give "git apply" a special option so that it will apply patches beyond the end of file (and trusting the line numbers in the chunks). 2. Having "git rebase" remember the number of blanks line that was removed in each previous file in previous fixed commits and add them back just before invoking "git apply". It is possible that it is too much work to implement it to be worthwhile (especially solution 2), but I do think it is possible. If you don't agree, fair enough. In that case I will hold on to the test case and only re-submit it if I can also include a fix. -- Björn Gustavsson, Erlang/OTP, Ericsson AB ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix" 2010-02-08 7:37 ` Björn Gustavsson @ 2010-02-09 21:58 ` Junio C Hamano 2010-02-10 20:20 ` Björn Gustavsson 0 siblings, 1 reply; 7+ messages in thread From: Junio C Hamano @ 2010-02-09 21:58 UTC (permalink / raw) To: Björn Gustavsson; +Cc: git Björn Gustavsson <bgustavsson@gmail.com> writes: > I see at least two possible ways to implement that: > > 1. Have "git rebase" give "git apply" a special option so that it > will apply patches beyond the end of file (and trusting the > line numbers in the chunks). > > 2. Having "git rebase" remember the number of blanks line that > was removed in each previous file in previous fixed commits > and add them back just before invoking "git apply". I actually changed my mind after thinking about it a bit more. I think you should be able to do the same thing as we already do in git-apply inside match_fragment(), and without involving any "rebase" specific hacks nor options, the result will cover most of the real-life use cases. The existing logic in the function says "We were told to find a location in the img (i.e. the text being patched) where preimage appears. We will replace the copy of the preimage in img with the corresponding postimage. Unfortunately, the preimage does not exactly match, but if we consider that we may have applied earlier patches with whitespace=fix, we can see that the given preimage matches with this location in our img except for whitespace differences." When it finds the place that "fuzzily" matches the preimage, it adjusts the given preimage to match what we have (i.e. pretends that the patch sender sent a patch based on a version of the text with whitespace fixes we made in ours already applied), and then let the common logic replace that copy of preimage with postimage in img. The hunk is applied successfully, and we are happy. Using the same logic, your "a/b/c trail" -> "a/b/c gap d/e/f" example would work like this. After applying the first patch with blank-at-eof, we will have "a/b/c" (no trailing blanks) in img. The second patch comes. It looks like this: @@ -l,3 +k,6 @@ b c +d +e +f This tells you to match "'b c blank' at the end" (lack of trailing context would trigger match_end hint, I think). So you go ahead and look at your img, which is "a/b/c" (no trailing blanks). It does not match. Just like existing logic in match_fragment() knows that the img might have undergone whitespace fixes, your new logic would realize that with an addition of one empty line to make your img "a/b/c blank", the preimage of this hunk matches the way as you are told to search. At that point, you adjust the preimage to match what we have (the logic is exactly the same as blank-at-eol case---adjust the patch to pretend that they started with a whitespace-corrected version we have). This would make the hunk look like this: @@ -l,2 +k,6 @@ b c + +d +e +f and by applying that adjusted hunk, you will get the expected result, namely, "a/b/c gap d/e/f". Notice that we did all that without ever looking at 'l' and 'k' (hunk offsets)? I'd like to see this logic (and only this logic, without relying on the diff hunk offset numbers at all) done first, because it is very much in line with what we already do, and more importantly, because it is a more general solution that is applicable outside the context of rebase. This of course will not catch the case where you used to have added tons of blanks at the end in the earlier patch (and losing sight of how many blanks you removed while applying it). You would need to build a special case that probably relies on the diff hunk offset, and trigger that additional logic in rebase (i.e. the caller that _knows_ the hunk offsets are reliable). That special case may not involve a change to "git apply" at all, as you suggested as an alternative (2), as you can do that all inside "git rebase". But I'd rather like to see a solution that does not rely on the special case as one patch that is independent from rebase, and the special case built on top of it, as a separate patch. After all, if you _are_ applying with whitespace=fix and blank-at-eof is in effect, it is very likely that the nature of the contents in that path is something that blank lines in the middle does not matter except for readability (e.g. something like .git/config file format); the fact that it is safe to strip the blank-at-eof strongly suggests that blanks do not have semantic meaning and are there purely for readability. In contents of such a nature, it would not matter if you lost indeterminate number of blank lines in the middle by applying two patches, the first one leaving 100 blank lines at the end and then the second one adding some non-blank lines at the end. It might even turn out to be a moot point to worry about the special case hack if that is the case. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix" 2010-02-09 21:58 ` Junio C Hamano @ 2010-02-10 20:20 ` Björn Gustavsson 0 siblings, 0 replies; 7+ messages in thread From: Björn Gustavsson @ 2010-02-10 20:20 UTC (permalink / raw) To: Junio C Hamano; +Cc: git 2010/2/9 Junio C Hamano <gitster@pobox.com>: > I actually changed my mind after thinking about it a bit more. Thanks for thinking more about it and for your explanation about the algorithm I am quite busy at the moment, but I hope to get started on a patch series after the end of February. > I'd like to see this logic (and only this logic, without relying on the > diff hunk offset numbers at all) done first, because it is very much in > line with what we already do, and more importantly, because it is a more > general solution that is applicable outside the context of rebase. I'll try to implement that logic. It seems that it should cover 99% of all real-world use cases. I will not worry about the general case (tons of blanks) until I have got the first patch working. -- Björn Gustavsson, Erlang/OTP, Ericsson AB ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-02-10 20:20 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-02-07 8:10 [PATCH] Add a test for a problem in "rebase --whitespace=fix" Björn Gustavsson 2010-02-07 18:38 ` Junio C Hamano 2010-02-07 22:44 ` Björn Gustavsson 2010-02-08 0:15 ` Junio C Hamano 2010-02-08 7:37 ` Björn Gustavsson 2010-02-09 21:58 ` Junio C Hamano 2010-02-10 20:20 ` Björn Gustavsson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).