git.vger.kernel.org archive mirror
From: Johannes Sixt <j6t@kdbg.org>
To: Mike Hommey <mh@glandium.org>, git@vger.kernel.org
Subject: Re: Something wrong with diff --color-words=regexp?
Date: Fri, 20 Feb 2015 08:49:09 +0100
Message-ID: <54E6E6F5.2020202@kdbg.org>
In-Reply-To: <20150219235213.GA1291@glandium.org>

On 20.02.2015 at 00:52, Mike Hommey wrote:
> Hi,
>
> I was trying to use --color-words with a regex to check a diff, and it appears
> it displays things out of order. Am I misunderstanding what my regexp should be
> doing or is there a bug?
>
> $ git diff -U3 HEAD^ dom/base/nsDOMFileReader.cpp
> diff --git a/dom/base/nsDOMFileReader.cpp b/dom/base/nsDOMFileReader.cpp
> index 6267e0e..fa22590 100644
> --- a/dom/base/nsDOMFileReader.cpp
> +++ b/dom/base/nsDOMFileReader.cpp
> @@ -363,7 +363,7 @@ nsDOMFileReader::DoReadData(nsIAsyncInputStream* aStream, uint64_t aCount)
>         return NS_ERROR_OUT_OF_MEMORY;
>       }
>       if (mDataFormat != FILE_AS_ARRAYBUFFER) {
> -      mFileData = (char *) moz_realloc(mFileData, mDataLen + aCount);
> +      mFileData = (char *) realloc(mFileData, mDataLen + aCount);
>         NS_ENSURE_TRUE(mFileData, NS_ERROR_OUT_OF_MEMORY);
>       }
>
> $ git diff -U3 --color-words='[^ ()]' HEAD^ dom/base/nsDOMFileReader.cpp
> diff --git a/dom/base/nsDOMFileReader.cpp b/dom/base/nsDOMFileReader.cpp
> index 6267e0e..fa22590 100644
> --- a/dom/base/nsDOMFileReader.cpp
> +++ b/dom/base/nsDOMFileReader.cpp
> @@ -363,7 +363,7 @@ nsDOMFileReader::DoReadData(nsIAsyncInputStream* aStream, uint64_t aCount)
>        return NS_ERROR_OUT_OF_MEMORY;
>      }
>      if (mDataFormat != FILE_AS_ARRAYBUFFER) {
>        mFileData = (char *moz_) realloc(mFileData, mDataLen + aCount);
>        NS_ENSURE_TRUE(mFileData, NS_ERROR_OUT_OF_MEMORY);
>      }

Your regexp says that every character (with a few exceptions) by itself 
is a word. Your diff says that it deleted the words 'm', 'o', 'z', and 
'_'. So, that is not wrong.
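
To see what "every character is a word" means for this line, here is a
small sketch in Python. It assumes that Python's re module treats this
particular bracket expression the same way as the POSIX regexp that git
uses; the variable names are made up for illustration.

```python
import re

WORD_REGEX = r"[^ ()]"  # the regexp from the report, without '+'

old_line = "mFileData = (char *) moz_realloc(mFileData, mDataLen + aCount);"
new_line = "mFileData = (char *) realloc(mFileData, mDataLen + aCount);"

# Each match is one "word"; everything between matches counts as whitespace.
old_words = re.findall(WORD_REGEX, old_line)
new_words = re.findall(WORD_REGEX, new_line)

# Every word is exactly one character long.
print(all(len(word) == 1 for word in old_words))

# The four words after '*' in the pre-image are 'm', 'o', 'z', '_';
# the post-image simply lacks them.
print(old_words[15:19])
print(len(old_words) - len(new_words))
```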

Furthermore, your regexp says that space, '(' and ')' are whitespace. 
Whitespace is *ignored* for computation of the word difference. 
Nevertheless, --color-words mode helpfully keeps the whitespace of the 
post-image to produce readable output. In doing so, it has to choose 
whether to keep the whitespace before or after a word. It chooses to 
keep it before a word. Hence, you see the whitespace sequence ') ' 
attached in front of 'r' (of 'realloc') instead of after '*'. So, the 
procedure is a matter of choice, which sometimes does not match 
expectations.
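
The effect can be reproduced with a simplified model. The sketch below is
not git's implementation: it uses Python's difflib in place of git's diff
machinery, marks changes as [-removed-] and {+added+} instead of colors,
and all function names are hypothetical. It only illustrates the choice
described above, namely that each post-image word carries the whitespace
in front of it, so a deletion lands right after the previous word.

```python
import difflib
import re


def split_words(line, word_regex):
    """Split line into (separator, word) pairs plus the trailing separator.

    The separator is the non-word text found *before* each word.
    """
    pairs = []
    pos = 0
    for match in re.finditer(word_regex, line):
        pairs.append((line[pos:match.start()], match.group()))
        pos = match.end()
    return pairs, line[pos:]


def span(pairs, start, end):
    """Text of pairs[start:end] without the separator before its first word."""
    chunk = pairs[start:end]
    if not chunk:
        return ""
    return chunk[0][1] + "".join(sep + word for sep, word in chunk[1:])


def render_word_diff(old_line, new_line, word_regex):
    """Render a word diff of two lines, keeping the post-image whitespace."""
    old_pairs, _ = split_words(old_line, word_regex)
    new_pairs, tail = split_words(new_line, word_regex)
    matcher = difflib.SequenceMatcher(
        None,
        [word for _, word in old_pairs],
        [word for _, word in new_pairs],
        autojunk=False,
    )
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(sep + word for sep, word in new_pairs[j1:j2])
        elif tag == "delete":
            # No post-image word here, hence no whitespace to print first.
            out.append("[-" + span(old_pairs, i1, i2) + "-]")
        else:  # "insert" or "replace"
            out.append(new_pairs[j1][0])  # whitespace before the new word
            if tag == "replace":
                out.append("[-" + span(old_pairs, i1, i2) + "-]")
            out.append("{+" + span(new_pairs, j1, j2) + "+}")
    return "".join(out) + tail


OLD = "mFileData = (char *) moz_realloc(mFileData, mDataLen + aCount);"
NEW = "mFileData = (char *) realloc(mFileData, mDataLen + aCount);"

# prints: mFileData = (char *[-moz_-]) realloc(mFileData, mDataLen + aCount);
print(render_word_diff(OLD, NEW, r"[^ ()]"))
```

The ') ' ends up in front of 'r' because it is stored as the separator of
that word, which is the same placement as in the reported output.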

Perhaps you meant to say

     --color-words='[^ ()]+'

to split the diff text into longer words.
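
For comparison, a short sketch of how the line splits with the '+' added,
again assuming Python's re module agrees with git's regexp flavor for this
simple pattern (the variable names are illustrative only):

```python
import re

LONG_WORD_REGEX = r"[^ ()]+"  # one or more non-separator characters

pre_image = "mFileData = (char *) moz_realloc(mFileData, mDataLen + aCount);"
post_image = "mFileData = (char *) realloc(mFileData, mDataLen + aCount);"

pre_words = re.findall(LONG_WORD_REGEX, pre_image)
post_words = re.findall(LONG_WORD_REGEX, post_image)

# Both lines have the same number of words; only one word differs.
changed = [(a, b) for a, b in zip(pre_words, post_words) if a != b]
print(pre_words)
print(changed)
```

With this regexp the change is a single word replaced by another, so the
word diff no longer has a bare deletion to place next to the whitespace.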

-- Hannes
