git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Add a test for a problem in "rebase --whitespace=fix"
@ 2010-02-07  8:10 Björn Gustavsson
  2010-02-07 18:38 ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Björn Gustavsson @ 2010-02-07  8:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

The command "git rebase --whitespace=fix HEAD~<N>" is supposed to
only clean up trailing whitespace, and the expectation is that it
cannot fail.

Unfortunately, if one commit adds a blank line at the end of a file
and a subsequent commit adds more non-blank lines after the blank
line, "git apply" (used indirectly by "git rebase") will fail to apply
the patch of the second commit.

Signed-off-by: Björn Gustavsson <bgustavsson@gmail.com>
---
 t/t3417-rebase-whitespace-fix.sh |   45 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 45 insertions(+), 0 deletions(-)
 create mode 100755 t/t3417-rebase-whitespace-fix.sh

diff --git a/t/t3417-rebase-whitespace-fix.sh b/t/t3417-rebase-whitespace-fix.sh
new file mode 100755
index 0000000..55cbce7
--- /dev/null
+++ b/t/t3417-rebase-whitespace-fix.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+test_description='git rebase --whitespace=fix
+
+This test runs git rebase --whitespace=fix and make sure that it works.
+'
+
+. ./test-lib.sh
+
+# prepare initial revision of "file" with a blank line at the end
+cat >file <<EOF
+a
+b
+c
+
+EOF
+
+# expected contents in "file" after rebase
+cat >expect <<EOF
+a
+b
+c
+EOF
+
+# prepare second revision of "file"
+cat >second <<EOF
+a
+b
+c
+
+d
+e
+f
+EOF
+
+test_expect_failure 'blanks line at end of file; extend at end of file' '
+	git commit --allow-empty -m "Initial empty commit" &&
+	git add file && git commit -m first &&
+	mv second file &&
+	git add file &&	git commit -m second &&
+	git rebase --whitespace=fix HEAD^^ &&
+	git diff --exit-code HEAD^:file expect
+'
+
+test_done
-- 
1.7.0.rc1.46.g04bf4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix"
  2010-02-07  8:10 [PATCH] Add a test for a problem in "rebase --whitespace=fix" Björn Gustavsson
@ 2010-02-07 18:38 ` Junio C Hamano
  2010-02-07 22:44   ` Björn Gustavsson
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2010-02-07 18:38 UTC (permalink / raw)
  To: Björn Gustavsson; +Cc: git

Björn Gustavsson <bgustavsson@gmail.com> writes:

> The command "git rebase --whitespace=fix HEAD~<N>" is supposed to
> only clean up trailing whitespace, and the expectation is that it
> cannot fail.

I don't know if the expectation is sound to begin with, for exactly the
reason you mention below.

> Unfortunately, if one commit adds a blank line at the end of a file
> and a subsequent commit adds more non-blank lines after the blank
> line,...

First, is this a condition that we want to change the behaviour to
"succeed" later?  

Imagine that the gap between abc and def block in your example is much
larger to exceed the number of pre-context lines of your second patch
(usually 3), and imagine you are the "git apply --whitespace=fix" program
you have updated to "fix" the preceived problem.  You know you earlier
might have stripped some blank lines at the EOF, but there is nothing that
tells you if you had only 3 blank lines, or you had even more.  How many
blank lines will you be adding back?

I think one fundamental difference between stripping of blanks at EOL and
blanks at EOF is that the former, even after applying an earlier patch
with the whitespace fix, could be fudged to an unspecified number of
whitespaces while you are applying the second one, exactly because you
will strip them out from the output of the second one anyway.  But the
latter will have to _appear_ in the result, as you have demonstrated, as a
gap between abc and def blocks in your example.  Simply there is not
enough information to do so.

Around 1.6.6/1.6.5.3 timeframe, we have separated blank-at-{eol,eof} out
of trailing-space to allow users to keep the traling blank lines.  Perhaps
you could demonstrate what is expected to work (and not bothering with
what is not ever expected to work) by changing the test like this.

I added one "trailing whitespace at EOL" example to keep the "strip
trailing whitespace" part working, by the way.

 t/t3417-rebase-whitespace-fix.sh |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/t/t3417-rebase-whitespace-fix.sh b/t/t3417-rebase-whitespace-fix.sh
index 55cbce7..0e366f8 100755
--- a/t/t3417-rebase-whitespace-fix.sh
+++ b/t/t3417-rebase-whitespace-fix.sh
@@ -7,9 +7,9 @@ This test runs git rebase --whitespace=fix and make sure that it works.
 
 . ./test-lib.sh
 
-# prepare initial revision of "file" with a blank line at the end
-cat >file <<EOF
-a
+# prepare initial revision of "file" with a trailing blank and a blank line at the end
+sed -e 's/Z//' >file <<\EOF
+a    Z
 b
 c
 
@@ -20,6 +20,7 @@ cat >expect <<EOF
 a
 b
 c
+
 EOF
 
 # prepare second revision of "file"
@@ -33,7 +34,9 @@ e
 f
 EOF
 
-test_expect_failure 'blanks line at end of file; extend at end of file' '
+test_expect_success 'keep blanks line at end of file' '
+	git config core.whitespace -blank-at-eof &&
+
 	git commit --allow-empty -m "Initial empty commit" &&
 	git add file && git commit -m first &&
 	mv second file &&

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix"
  2010-02-07 18:38 ` Junio C Hamano
@ 2010-02-07 22:44   ` Björn Gustavsson
  2010-02-08  0:15     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Björn Gustavsson @ 2010-02-07 22:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

2010/2/7 Junio C Hamano <gitster@pobox.com>:

> First, is this a condition that we want to change the behaviour to
> "succeed" later?

Yes, assuming it is possible to fix.

> Imagine that the gap between abc and def block in your example is much
> larger to exceed the number of pre-context lines of your second patch
> (usually 3), and imagine you are the "git apply --whitespace=fix" program
> you have updated to "fix" the preceived problem.  You know you earlier
> might have stripped some blank lines at the EOF, but there is nothing that
> tells you if you had only 3 blank lines, or you had even more.  How many
> blank lines will you be adding back?

My original idea was to add back exactly the number of lines needed
so that the context lines would match. That can be calculated from
the line numbers of the last line of the pre-image and the line number
in the chunk and by scanning the chunk for blank context lines
(both at the beginning and end of chunk). Since the blanks lines
at the end will be stripped away anyway, I doesn't matter if I add
back fewer lines than were there originally.

Thinking a little more about it, if there is a chunk that starts
beyond the end of file, I could add the number of blanks lines
that is missing up to the beginning of the chunk plus the
number of lines in the chunk itself. That will also take
care of the case that the chunk deletes blanks lines.
(That will probably add too many blanks at the end,
but they will be stripped out again.)

That is my plan. Of course, since I have not attempted
to implement it yet, there could be fatal flaws in it.

Do you see any fatal flaws that I don't see?

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix"
  2010-02-07 22:44   ` Björn Gustavsson
@ 2010-02-08  0:15     ` Junio C Hamano
  2010-02-08  7:37       ` Björn Gustavsson
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2010-02-08  0:15 UTC (permalink / raw)
  To: Björn Gustavsson; +Cc: git

Björn Gustavsson <bgustavsson@gmail.com> writes:

> 2010/2/7 Junio C Hamano <gitster@pobox.com>:
>
>> First, is this a condition that we want to change the behaviour to
>> "succeed" later?
>
> Yes, assuming it is possible to fix.
>
>> Imagine that the gap between abc and def block in your example is much
>> larger to exceed the number of pre-context lines of your second patch
>> (usually 3), and imagine you are the "git apply --whitespace=fix" program
>> you have updated to "fix" the preceived problem.  You know you earlier
>> might have stripped some blank lines at the EOF, but there is nothing that
>> tells you if you had only 3 blank lines, or you had even more.  How many
>> blank lines will you be adding back?
>
> My original idea was to add back exactly the number of lines needed
> so that the context lines would match. That can be calculated from
> the line numbers of the last line of the pre-image and the line number
> in the chunk and by scanning the chunk for blank context lines
> (both at the beginning and end of chunk). Since the blanks lines
> at the end will be stripped away anyway, I doesn't matter if I add
> back fewer lines than were there originally.

And if it were in the middle like your patch had?  

Suppose the first patch in your example ended with 10 blank lines instead
of just one.  You apply it with --ws=fix and end up with 3-liner file with
a/b/c.  The sender of that patch then builds on top of his copy (still
with 10 blanks at EOF).  Perhaps the early part a/b/c might be changed to
a/b/c/1/2 or something.  And on top of that change, he adds new text at
the end of the file (after those 10 blank lines) with another patch, like
your example added d/e/f at the end after the gap.

The sender then chooses to cherry pick and gives you only the last patch,
to add d/e/f, for whatever reason.  The change in the earlier part to add
1/2 after a/b/c was not suitable for public consumption yet, perhaps.  The
patch comes with 3 pre-context lines as usual, what you see begins with
three blank lines, and has an addition of d/e/f.

You have already stripped the blank lines at the end when you applied the
original one; you do not know how many blank lines at the end you lost
when you applied it (and you do not _want_ to record that when applying
either).

You cannot go by the line numbers on the "@@ -l,k +m,n @@" header line you
see in the second patch you received.  On that line, only k and n are
reliable numbers (the must match the patch text).  l and m are unreliable;
being able to apply even if the text you have at hand does not exactly
match l and m is the whole point of transmitting the change in the patch
format.  The _only_ information you have usable at that point is that
there are _at least_ 3 blank lines before the addition, and perhaps the
fact that the hunk ends without post context lines.  The latter tells you
that it must apply at the end, but still doesn't tell you how many blanks
you need to add back at EOF before applying the patch.

> Do you see any fatal flaws that I don't see?

I don't know---the above is what I already said in my previous message and
it already looked fatal enough to me.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix"
  2010-02-08  0:15     ` Junio C Hamano
@ 2010-02-08  7:37       ` Björn Gustavsson
  2010-02-09 21:58         ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Björn Gustavsson @ 2010-02-08  7:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

2010/2/8 Junio C Hamano <gitster@pobox.com>:

> You cannot go by the line numbers on the "@@ -l,k +m,n @@" header line you
> see in the second patch you received.  On that line, only k and n are
> reliable numbers (the must match the patch text).  l and m are unreliable;
> being able to apply even if the text you have at hand does not exactly
> match l and m is the whole point of transmitting the change in the patch
> format.  The _only_ information you have usable at that point is that
> there are _at least_ 3 blank lines before the addition, and perhaps the
> fact that the hunk ends without post context lines.  The latter tells you
> that it must apply at the end, but still doesn't tell you how many blanks
> you need to add back at EOF before applying the patch.

I agree. The information is not enough if you apply one patch
at the time.

But my usage case that my test tries to demonstrate is different:

I already have a number of commits in my repository (received
either by pulling or applying a whole series of patches at once).

I then do, for example:

git rebase --whitespace=fix HEAD~4

to clean up the existing commits.

That rebase uses "git apply" internally seems like an implementation
detail that I as a user of rebase don't care about. I just expect it
to work.

I see at least two possible ways to implement that:

1. Have "git rebase" give "git apply" a special option so that it
will apply patches beyond the end of file (and trusting the
line numbers in the chunks).

2. Having "git rebase" remember the number of blanks line that
was removed in each previous file in previous fixed commits
and add them back just before invoking "git apply".

It is possible that it is too much work to implement it to be
worthwhile (especially solution 2), but I do think it is possible.

If you don't agree, fair enough. In that case I will hold on
to the test case and only re-submit it if I can also include
a fix.

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix"
  2010-02-08  7:37       ` Björn Gustavsson
@ 2010-02-09 21:58         ` Junio C Hamano
  2010-02-10 20:20           ` Björn Gustavsson
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2010-02-09 21:58 UTC (permalink / raw)
  To: Björn Gustavsson; +Cc: git

Björn Gustavsson <bgustavsson@gmail.com> writes:

> I see at least two possible ways to implement that:
>
> 1. Have "git rebase" give "git apply" a special option so that it
> will apply patches beyond the end of file (and trusting the
> line numbers in the chunks).
>
> 2. Having "git rebase" remember the number of blanks line that
> was removed in each previous file in previous fixed commits
> and add them back just before invoking "git apply".

I actually changed my mind after thinking about it a bit more.

I think you should be able to do the same thing as we already do in
git-apply inside match_fragment(), and without involving any "rebase"
specific hacks nor options, the result will cover most of the real-life
use cases.

The existing logic in the function says "We were told to find a location
in the img (i.e. the text being patched) where preimage appears.  We will
replace the copy of the preimage in img with the corresponding postimage.
Unfortunately, the preimage does not exactly match, but if we consider
that we may have applied earlier patches with whitespace=fix, we can see
that the given preimage matches with this location in our img except for
whitespace differences."

When it finds the place that "fuzzily" matches the preimage, it adjusts
the given preimage to match what we have (i.e. pretends that the patch
sender sent a patch based on a version of the text with whitespace fixes
we made in ours already applied), and then let the common logic replace
that copy of preimage with postimage in img.

The hunk is applied successfully, and we are happy.

Using the same logic, your "a/b/c trail" -> "a/b/c gap d/e/f" example
would work like this.

After applying the first patch with blank-at-eof, we will have "a/b/c" (no
trailing blanks) in img.

The second patch comes.  It looks like this:

    @@ -l,3 +k,6 @@
     b
     c
     
    +d
    +e
    +f
    
This tells you to match "'b c blank' at the end" (lack of trailing context
would trigger match_end hint, I think).  So you go ahead and look at your
img, which is "a/b/c" (no trailing blanks).  It does not match.

Just like existing logic in match_fragment() knows that the img might have
undergone whitespace fixes, your new logic would realize that with an
addition of one empty line to make your img "a/b/c blank", the preimage of
this hunk matches the way as you are told to search.

At that point, you adjust the preimage to match what we have (the logic is
exactly the same as blank-at-eol case---adjust the patch to pretend that
they started with a whitespace-corrected version we have).  This would
make the hunk look like this:

    @@ -l,2 +k,6 @@
     b
     c
    +
    +d
    +e
    +f

and by applying that adjusted hunk, you will get the expected result,
namely, "a/b/c gap d/e/f".

Notice that we did all that without ever looking at 'l' and 'k' (hunk
offsets)?

I'd like to see this logic (and only this logic, without relying on the
diff hunk offset numbers at all) done first, because it is very much in
line with what we already do, and more importantly, because it is a more
general solution that is applicable outside the context of rebase.

This of course will not catch the case where you used to have added tons
of blanks at the end in the earlier patch (and losing sight of how many
blanks you removed while applying it).  You would need to build a special
case that probably relies on the diff hunk offset, and trigger that
additional logic in rebase (i.e. the caller that _knows_ the hunk offsets
are reliable).  That special case may not involve a change to "git apply"
at all, as you suggested as an alternative (2), as you can do that all
inside "git rebase".  But I'd rather like to see a solution that does not
rely on the special case as one patch that is independent from rebase, and
the special case built on top of it, as a separate patch.

After all, if you _are_ applying with whitespace=fix and blank-at-eof is
in effect, it is very likely that the nature of the contents in that path
is something that blank lines in the middle does not matter except for
readability (e.g. something like .git/config file format); the fact that
it is safe to strip the blank-at-eof strongly suggests that blanks do not
have semantic meaning and are there purely for readability.  In contents
of such a nature, it would not matter if you lost indeterminate number of
blank lines in the middle by applying two patches, the first one leaving
100 blank lines at the end and then the second one adding some non-blank
lines at the end.  It might even turn out to be a moot point to worry
about the special case hack if that is the case.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add a test for a problem in "rebase --whitespace=fix"
  2010-02-09 21:58         ` Junio C Hamano
@ 2010-02-10 20:20           ` Björn Gustavsson
  0 siblings, 0 replies; 7+ messages in thread
From: Björn Gustavsson @ 2010-02-10 20:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

2010/2/9 Junio C Hamano <gitster@pobox.com>:
> I actually changed my mind after thinking about it a bit more.

Thanks for thinking more about it and for your explanation
about the algorithm

I am quite busy at the moment, but I hope to get started
on a patch series after the end of February.

> I'd like to see this logic (and only this logic, without relying on the
> diff hunk offset numbers at all) done first, because it is very much in
> line with what we already do, and more importantly, because it is a more
> general solution that is applicable outside the context of rebase.

I'll try to implement that logic. It seems that it should cover
99% of all real-world use cases. I will not worry about the general
case (tons of blanks) until I have got the first patch working.

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-02-10 20:20 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-02-07  8:10 [PATCH] Add a test for a problem in "rebase --whitespace=fix" Björn Gustavsson
2010-02-07 18:38 ` Junio C Hamano
2010-02-07 22:44   ` Björn Gustavsson
2010-02-08  0:15     ` Junio C Hamano
2010-02-08  7:37       ` Björn Gustavsson
2010-02-09 21:58         ` Junio C Hamano
2010-02-10 20:20           ` Björn Gustavsson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).