From: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Robert Fitzsimons <robfitz@273k.net>
Subject: Re: [PATCHv1bis 1/2] git apply: option to ignore whitespace differences
Date: Fri, 3 Jul 2009 08:40:01 +0200
Message-ID: <cb7bb73a0907022340l88a226egad74a275055fb972@mail.gmail.com>
In-Reply-To: <7vhbxuzlkk.fsf@alter.siamese.dyndns.org>

On Fri, Jul 3, 2009 at 1:55 AM, Junio C Hamano<gitster@pobox.com> wrote:
> By the way, I think we need to make sure your understanding of how the
> current code works matches mine before you go any further.

Sounds reasonable.

> Are the words "preimage", "postimage" and "target" used consistently
> between us?  By these words, I mean:
>
>  preimage = the lines prefixed with '-' and ' ' in the patch
>
>  postimage = the lines prefixed with ' ' and '+' in the patch
>
>  target = lines in the file being patched that correspond to the preimage

It _did_ take me a little while to understand the names when I started
working on the feature, but I got on track pretty soon (at the first
segfault ;-)).
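
For readers following along, the definitions above can be sketched in a
few lines. This is only an illustration, not the code in git: the
function name split_hunk and the list-of-lines representation are
assumptions made for the example, and the hunk is the one from the
sample patch discussed later in this message.

```python
def split_hunk(hunk_lines):
    """Split the body of one hunk into (preimage, postimage) line lists.

    Hypothetical helper for illustration; each element of hunk_lines is
    one hunk line including its leading ' ', '-' or '+' marker.
    """
    preimage, postimage = [], []
    for line in hunk_lines:
        marker, text = line[:1], line[1:]
        if marker in (" ", "-"):
            preimage.append(text)   # context and removed lines
        if marker in (" ", "+"):
            postimage.append(text)  # context and added lines
    return preimage, postimage


# The hunk body from the sample patch (first character is the marker).
hunk = [
    " a  a  a",
    " b b  b",
    "+q",
    " c  c",
    "   d",
]
pre, post = split_hunk(hunk)
```

Context lines land in both lists, which is why any whitespace damage in
the context shows up in the postimage as well.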

> The point of patch application is to find a block of lines in the target
> that matches preimage, and replace that block with postimage.  When the
> patch applies cleanly (which is the case we should optimize for), the
> preimage matches the target byte-for-byte.  The hunk starting at line 1690
> does a memcmp of the whole thing, without ws fuzz, for this reason.  You
> do not want to touch that part with your patch (and that is why I am
> writing this message to make sure you understand what you are doing).

Of course.

> After that, as a fallback, we compare line-by-line, while fixing the
> whitespace breakage in the preimage (what the patch author based on) and
> the target (what we currently have).

> [...] preimage and target won't match byte-for-byte, but by
> applying the whitespace breakage on each of the preimage line and the
> corresponding target line, they will match in either of the above cases.
> While doing this "convert-and-match", we prepare a version of preimage
> with whitespace breakage fixed to give to update_pre_post_images() at the
> end of the function in fixed_buf.

Indeed. This is why in my 2/2 patch I do a similar operation to bring
the preimage whitespace to match the target whitespace if matching was
done ignoring whitespace (but we never got to that part, for obvious
reasons).
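
The two stages described above (exact comparison first, fuzzy
comparison only as a fallback) can be sketched roughly as follows.
This is a simplified model under stated assumptions, not git's
match_fragment(): it works on lists of lines rather than byte buffers,
and both squash_ws and match_at are hypothetical names. The
normalisation used here (collapse runs of whitespace, strip both ends)
is just one possible definition of "ignoring whitespace".

```python
def squash_ws(line):
    # Assumed normalisation: collapse whitespace runs and strip both ends.
    return " ".join(line.split())


def match_at(target, pos, preimage, ignore_ws=False):
    """Return True if preimage matches target starting at line index pos."""
    window = target[pos:pos + len(preimage)]
    if len(window) != len(preimage):
        return False  # not enough lines left in the target
    # Stage 1: exact comparison, the common case for a clean patch.
    if window == preimage:
        return True
    # Stage 2: only as a fallback, compare line by line with whitespace fuzz.
    if not ignore_ws:
        return False
    return all(squash_ws(t) == squash_ws(p) for t, p in zip(window, preimage))


target = ["a a a", "b b b", "c c", "d", "e e"]
preimage = ["a  a  a", "b b  b", "c  c", "  d"]
exact = match_at(target, 0, preimage)
fuzzy = match_at(target, 0, preimage, ignore_ws=True)
```

With the sample data, the exact stage fails and only the fuzzy stage
finds the match, so the fast path for clean patches is left untouched.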

> This is another point I am worried about your patch.  Suppose you have this
> target:
>
>    a a a
>    b b b
>    c c
>    d
>    e e
>
> And we have a broken patch that needs --ignore-whitespace to apply:
>
>    diff --git a/file b/file
>    index xxxxxx..yyyyyy 100644
>    @@ -1,4 +1,5 @@
>     a  a  a
>     b b  b
>    +q
>     c  c
>       d
>
> Your preimage is "a  a  a\nb b  b\nc  c\n  d\n",
> target is        "a a a\nb b b\nc c\nd\ne e\n",
> and postimage is "a  a  a\nb b  b\nq\nc  c\n  d\n".
>
> Wouldn't you want to have this as the result of patch application?
>
>    a a a
>    b b b
>    q
>    c c
>    d
>    e e
>
> With whitespace squashed, the preimage would match the target (perhaps
> after fixing line_matches()), but wsfix_copy() called while we fix each
> preimage line won't have changed anything in the fixed_buf that is to
> become the new preimage, and update_pre_post_images() while copying the
> fixed preimage to the postimage won't have corrected "a  a  a" back to
> "a a a" that was in the target as the result.
>
> So I suspect that you would instead end up with:
>
>    a  a  a
>    b b  b
>    c  c
>      d
>    e e

This is indeed the case with my 1/2 patch: no whitespace adjustment is
done on the pre- and postimage when the preimage and target match with
whitespace fuzz and ignore_whitespace is active. In the first RFC I
sent I expressly mentioned that this was what I did not like about my
patch. When I first sent a _series_, it was made of two patches, the
second of which served the purpose of realigning the whitespace of
the patch (pre- and postimage) to the whitespace of the target (at
least for the common lines).
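
To make the intended behaviour concrete, here is a rough sketch of
applying a hunk while keeping the target's own whitespace on the common
lines. It is not the code from the 2/2 patch: normalize and apply_hunk
are hypothetical names, the hunk is assumed to be already located at
line index pos, and whitespace is "ignored" by collapsing runs and
stripping both ends.

```python
def normalize(line):
    # Assumed normalisation: collapse whitespace runs and strip both ends.
    return " ".join(line.split())


def apply_hunk(target, pos, hunk_lines):
    """Apply one hunk at line index pos, keeping the target's context lines."""
    result = list(target[:pos])
    cursor = pos
    for line in hunk_lines:
        marker, text = line[:1], line[1:]
        if marker == "+":
            result.append(text)  # added lines come from the patch as-is
            continue
        if cursor >= len(target) or normalize(target[cursor]) != normalize(text):
            raise ValueError("hunk does not match at line %d" % cursor)
        if marker == " ":
            # Context line: copy the target's version, not the patch's.
            result.append(target[cursor])
        # A '-' line is simply dropped from the result.
        cursor += 1
    result.extend(target[cursor:])
    return result


target = ["a a a", "b b b", "c c", "d", "e e"]
hunk = [" a  a  a", " b b  b", "+q", " c  c", "   d"]
patched = apply_hunk(target, 0, hunk)
```

On the sample data this yields the result asked for above: only the
"q" line is new, and the context lines keep the target's whitespace.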

> I think the intent of --ignore-whitespace is "don't worry about ws
> differences in the context when locating where to make the change", and it
> is not "I do not care about getting whitespace mangled anywhere in the
> file the patch touches."

I totally agree. This is important because it also means that when
re-diffing the applied patch you still get changes ONLY in the lines
where you SHOULD get changes, and not in the nearby context that only
had different whitespace.

I sent the thing in two patches to make it easier to review. If you
think it's more appropriate to squash them, I can do that no problem.

> correct_ws_error is special in that we can
> afford to take the fixed pre/postimage, "because we are fixing the ws
> breakage anyway", but arguably it _might_ be nicer to limit the change to
> the lines marked with '-' and '+' in the patch even in that case.

But that's a path we're not going to hit in match_fragment when
ignoring whitespace. Instead, one thing we could consider in this case
(ignore_whitespace) is to adjust the leading space in the + lines to
match the ws transformations done in the context lines, but that might
be taking whitespace fixing a little too far. Or we should rename it
to whitespace=adjust...

-- 
Giuseppe "Oblomov" Bilotta
