Git development
 help / color / mirror / Atom feed
* [RFC] solving a bug with hunks starting at line 1 in git apply
@ 2015-06-01 17:07 Remi LESPINET
  2015-06-01 17:16 ` Junio C Hamano
  2015-06-01 18:20 ` Matthieu Moy
  0 siblings, 2 replies; 9+ messages in thread
From: Remi LESPINET @ 2015-06-01 17:07 UTC (permalink / raw)
  To: git


======================================================================
= 1. The bug
======================================================================

hunks of the form:

@@ -1,k +n,m @@

have a special behavior because of the variable match_begining.  For
these hunks, offset is not allowed which means that the fragment has to
match at the first line of the image. This is done for the sake of
safety (This behavior does not exist in the basic patch command). Here's
an example

first original file:

10
20
30
40

for the following diff file:

@@ -1,2 +1,2 @@
 20
-30
+35
 40

The patch will not be applied with a git apply command, but it will
with a basic patch command. However, there's no problem with the
following diff for both:

@@ -2,2 +2,2 @@
 10
-20
+25
 30

The problem comes when we have a diff file like that (which can be
obtained by splitting a hunk with git add -p for example (unsolved bug
reported in 1bf01040f0c1101f410f9caa5e715460cdd0cbe0))

@@ -1,1 +1,2 @@
+5
 10
@@ -1,3 +2,3 @@
 10
+15
-20
 30

In this case the first hunk will be treated, the image will then become

5
10
20
30

The second will try to match the lines

10
20
30

with the first lines of the image, will not succeed and will return that
it is not possible without trying to match the image with offsets,
because the hunk begin with 1.

Moreover, there are cases where both the git apply and the
patch command will work, giving different outputs.

second original file:

10
10
20

diff file associated:

@@ -1,1 +1,2 @@
+10
 10
@@ -1,2 +2,3 @@
 10
+cc
 10

With git apply, the problem will silently appear, and the command will
give the following output:

10
cc
10
10
20

whereas the output given by the simple patch command will be:

10
10
cc
10
20

======================================================================
= 2. Correction
======================================================================

I see mainly two ways to fix the bug:

********************************************************************
* 2.1 first method (the most basic)
********************************************************************

The most basic is to change the code so that the special behavior
only affects the hunks of the form

@@ -1,k +1,m @@

Which makes the previous diff file possible to use. And fixes the
bug reported in 1bf01040f0c1101f410f9caa5e715460cdd0cbe0

Note that this modification, which changes positively behavior of hunks

@@ -1,k +l,m @@

where l is not 1, when there's a hunk

@@ -1,a +1,b @@

above, also changes the behavior when there's no such bloc above:

For example: the diff file

@@ -1,2 +2,2 @@
 20
-30
+35
 40

would have given an error before this patch (with the first original file)
whereas it works with the modification introduced by the patch.

**********************************************************************
* 2.2 second method
**********************************************************************

The other method, is to pass a base_line and an offset to the match_fragment
function instead of passing directly the line to test.

This does not involve any change in the behavior of apply, and therefore
has not the problem of the most basic method.

======================================================================
= 3. Question
======================================================================

I personnaly think that the second method is better, but this implies more
modifications, and it will be useful only to keep the failing behavior of 
some hand-modified patches... Do you think that the second method is
worth implementing ?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 17:07 [RFC] solving a bug with hunks starting at line 1 in git apply Remi LESPINET
@ 2015-06-01 17:16 ` Junio C Hamano
  2015-06-01 17:30   ` Junio C Hamano
  2015-06-01 18:00   ` Matthieu Moy
  2015-06-01 18:20 ` Matthieu Moy
  1 sibling, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2015-06-01 17:16 UTC (permalink / raw)
  To: Remi LESPINET; +Cc: git

Remi LESPINET <remi.lespinet@ensimag.grenoble-inp.fr> writes:

> first original file:
>
> 10
> 20
> 30
> 40
>
> for the following diff file:
>
> @@ -1,2 +1,2 @@
>  20
> -30
> +35
>  40
>
> The patch will not be applied with a git apply command, but it will
> with a basic patch command.

Doesn't that merely indicate a bug in "patch", though?

The patch text says that the preimage must begin with "20" but the
target begins with "10".

I think we have the same "the boundary must match" logic not just
at the beginning of the file but also at the end of the file.  No
matter what the answer to the above question is, we would need to
do the same for both the beginning and the end, I would think.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 17:16 ` Junio C Hamano
@ 2015-06-01 17:30   ` Junio C Hamano
  2015-06-01 18:00   ` Matthieu Moy
  1 sibling, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2015-06-01 17:30 UTC (permalink / raw)
  To: Remi LESPINET; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Remi LESPINET <remi.lespinet@ensimag.grenoble-inp.fr> writes:
>
>> first original file:
>>
>> 10
>> 20
>> 30
>> 40
>>
>> for the following diff file:
>>
>> @@ -1,2 +1,2 @@
>>  20
>> -30
>> +35
>>  40
>>
>> The patch will not be applied with a git apply command, but it will
>> with a basic patch command.
>
> Doesn't that merely indicate a bug in "patch", though?

This needs a bit gentler explanation.  Imagine you are doing a usual
"git diff" with two lines of context, with a longer original.  A
change to do s/30/35/ would look like this:

@@ -1,5 +1,5 @@
 10
 20
-30
+35
 40
 50

And an attempt to apply this patch to an original that does not have
10, 20, 30, 40 and 50 in this order must fail.

On the other hand, if the change were instead s/20/35/, then the
diff output would look like this instead:

@@ -1,4 +1,4 @@
 10
-20
+25
 30
 40

Notice that the pre-context lines in this second example is only one
line.  Is it giving the same degree of safety if we rejected an
attempt to apply this patch only when the original does not have 10,
20, 30 and 40 in this order?

No.  Because we are doing two-line context patch, the patch is not
just saying "this change applies to a place where the first line is
10".  It also is saying "there is no line before that line".

To answer my own question in the previous message, we cannot tell
only from that example which of "git apply" and "patch" is wrong.
If you generated the patch with "git diff -U1", then "git apply"
that insists that there must be no line before that "20" is wrong.
If you generated the patch with "git diff" with the default two-line
context, then "patch" is wrong if it does not make sure "20" is the
first line.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 17:16 ` Junio C Hamano
  2015-06-01 17:30   ` Junio C Hamano
@ 2015-06-01 18:00   ` Matthieu Moy
  2015-06-01 18:37     ` Junio C Hamano
  1 sibling, 1 reply; 9+ messages in thread
From: Matthieu Moy @ 2015-06-01 18:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Remi LESPINET, git

Junio C Hamano <gitster@pobox.com> writes:

> Remi LESPINET <remi.lespinet@ensimag.grenoble-inp.fr> writes:
>
>> first original file:
>>
>> 10
>> 20
>> 30
>> 40
>>
>> for the following diff file:
>>
>> @@ -1,2 +1,2 @@
>>  20
>> -30
>> +35
>>  40
>>
>> The patch will not be applied with a git apply command, but it will
>> with a basic patch command.
>
> Doesn't that merely indicate a bug in "patch", though?

No, it's just that patch does a fuzzy match in this case:

$ patch < patch.diff 
patching file pre.txt
Hunk #1 succeeded at 2 with fuzz 1 (offset 1 line).

It's different from the other case:

Remi LESPINET <remi.lespinet@ensimag.grenoble-inp.fr> writes:

> @@ -1,1 +1,2 @@
> +5
>  10
> @@ -1,3 +2,3 @@
>  10
> +15
> -20
>  30

With this one, I get:

$ git apply < p2.diff   
error: patch failed: pre.txt:1
error: pre.txt: patch does not apply
$ patch < p2.diff 
patching file pre.txt

=> no fuzzy matching for patch, git apply should actually work.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 17:07 [RFC] solving a bug with hunks starting at line 1 in git apply Remi LESPINET
  2015-06-01 17:16 ` Junio C Hamano
@ 2015-06-01 18:20 ` Matthieu Moy
  1 sibling, 0 replies; 9+ messages in thread
From: Matthieu Moy @ 2015-06-01 18:20 UTC (permalink / raw)
  To: Remi LESPINET; +Cc: git

Remi LESPINET <remi.lespinet@ensimag.grenoble-inp.fr> writes:

> ======================================================================
> = 2. Correction
> ======================================================================
>
> I see mainly two ways to fix the bug:
>
> ********************************************************************
> * 2.1 first method (the most basic)
> ********************************************************************
>
> The most basic is to change the code so that the special behavior
> only affects the hunks of the form
>
> @@ -1,k +1,m @@

I do not think that this would work in all cases. It seems git apply has
issues with overlapping contexts, not just with hunks starting at line
1. See:

$ cat pre.txt
-1
0
10
20
30
40
$ cat p2.diff 
--- pre.txt
+++ pre.txt
@@ -2,2 +2,3 @@
 0
+5
 10
@@ -3,3 +4,3 @@
 10
+15
-20
 30
$ git apply p2.diff
error: patch failed: pre.txt:3
error: pre.txt: patch does not apply
$ patch < p2.diff 
patching file pre.txt
$ cat pre.txt
-1
0
5
10
15
30
40

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 18:00   ` Matthieu Moy
@ 2015-06-01 18:37     ` Junio C Hamano
  2015-06-01 18:47       ` Matthieu Moy
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2015-06-01 18:37 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Remi LESPINET, git

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

>> @@ -1,1 +1,2 @@
>> +5
>>  10
>> @@ -1,3 +2,3 @@
>>  10
>> +15
>> -20
>>  30
>
> With this one, I get:
>
> $ git apply < p2.diff   
> error: patch failed: pre.txt:1
> error: pre.txt: patch does not apply
> $ patch < p2.diff 
> patching file pre.txt
>
> => no fuzzy matching for patch, git apply should actually work.

I am not sure what you are trying to do with that patch that tries
to touch the same line twice.  Is this the same old laziness coming
back to bite us, the one that we attempted to work around with
933e44d3 ("add -p": work-around an old laziness that does not
coalesce hunks, 2011-04-06)?

In other words, isn't the right fix to coalesce that input, so that
the second hunk does *not* require fuzzy application in the first
place?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 18:37     ` Junio C Hamano
@ 2015-06-01 18:47       ` Matthieu Moy
  2015-06-01 18:53         ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Matthieu Moy @ 2015-06-01 18:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Remi LESPINET, git

Junio C Hamano <gitster@pobox.com> writes:

> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:
>
>>> @@ -1,1 +1,2 @@
>>> +5
>>>  10
>>> @@ -1,3 +2,3 @@
>>>  10
>>> +15
>>> -20
>>>  30
>>
>> With this one, I get:
>>
>> $ git apply < p2.diff   
>> error: patch failed: pre.txt:1
>> error: pre.txt: patch does not apply
>> $ patch < p2.diff 
>> patching file pre.txt
>>
>> => no fuzzy matching for patch, git apply should actually work.
>
> I am not sure what you are trying to do with that patch that tries
> to touch the same line twice.  Is this the same old laziness coming
> back to bite us, the one that we attempted to work around with
> 933e44d3 ("add -p": work-around an old laziness that does not
> coalesce hunks, 2011-04-06)?

Indeed, "git apply" works with --allow-overlap in this case. But this is
not sufficient to fix "git add -p" which already uses it. So, there's
something else.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 18:47       ` Matthieu Moy
@ 2015-06-01 18:53         ` Junio C Hamano
  2015-06-01 21:37           ` Remi Lespinet
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2015-06-01 18:53 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Remi LESPINET, git

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

>> I am not sure what you are trying to do with that patch that tries
>> to touch the same line twice.  Is this the same old laziness coming
>> back to bite us, the one that we attempted to work around with
>> 933e44d3 ("add -p": work-around an old laziness that does not
>> coalesce hunks, 2011-04-06)?
>
> Indeed, "git apply" works with --allow-overlap in this case. But this is
> not sufficient to fix "git add -p" which already uses it. So, there's
> something else.

I do not have time to go back to the list archive myself at this
moment, but I suspect that in the discussion around the time back
when 0beee4c6 (git-add--interactive: remove hunk coalescing,
2008-07-02) was done, we'd find some material for me to say "I told
you so" X-< to those who added that laziness to "add--interactive".

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] solving a bug with hunks starting at line 1 in git apply
  2015-06-01 18:53         ` Junio C Hamano
@ 2015-06-01 21:37           ` Remi Lespinet
  0 siblings, 0 replies; 9+ messages in thread
From: Remi Lespinet @ 2015-06-01 21:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matthieu Moy, git

Ok, Thanks for all the informations

> Notice that the pre-context lines in this second example is only one
> line.  Is it giving the same degree of safety if we rejected an
> attempt to apply this patch only when the original does not have 10,
> 20, 30 and 40 in this order?
> 
> No.  Because we are doing two-line context patch, the patch is not
> just saying "this change applies to a place where the first line is
> 10".  It also is saying "there is no line before that line".

Yes, It's obvious that "patch" has not the same degree of safety as the
git apply command. But I thought that any patch working with
git apply should work with the "patch" command and give the same output,
It seems that this is not true, regarding

file:

10
10
10
10
20

diff:

@@ -1,2 +1,3 @@
+10
 10
 10
@@ -1,4 +2,5 @@
 10
 10
+cc
 10
 10

(I changed it to have 2 line context).  Of course these are
hand-written patches, which can't be obtained with a diff (except with
the non coalescing git add -p as you said yourself). Wouldn't it 
be a problem?

> In other words, isn't the right fix to coalesce that input, so that
> the second hunk does *not* require fuzzy application in the first
> place?

Yes, you're right, that will be fixed if we restore coalescing, I
didn't thought of this possibility. This will cause fewer split but
we have the edit option anyway.


Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-06-01 21:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-06-01 17:07 [RFC] solving a bug with hunks starting at line 1 in git apply Remi LESPINET
2015-06-01 17:16 ` Junio C Hamano
2015-06-01 17:30   ` Junio C Hamano
2015-06-01 18:00   ` Matthieu Moy
2015-06-01 18:37     ` Junio C Hamano
2015-06-01 18:47       ` Matthieu Moy
2015-06-01 18:53         ` Junio C Hamano
2015-06-01 21:37           ` Remi Lespinet
2015-06-01 18:20 ` Matthieu Moy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox