From: Thomas Rast <trast@student.ethz.ch>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git@vger.kernel.org
Subject: Re: [ILLUSTRATION PATCH] color-words: take an optional regular expression describing words
Date: Fri, 9 Jan 2009 13:24:33 +0100
Message-ID: <200901091324.40583.trast@student.ethz.ch>
In-Reply-To: <alpine.DEB.1.00.0901091255230.30769@pacific.mpi-cbg.de>

Johannes Schindelin wrote:
> 
> In some applications, words are not delimited by white space.  To
> allow for that, you can specify a regular expression describing
> what makes a word with
> 
> 	git diff --color-words='^[A-Za-z0-9]*'
[...]
> 	> Intuitively, all you would have to do is to replace this part in 
> 	> diff_words_show()
> 	> 
> 	>         for (i = 0; i < minus.size; i++)
> 	>                 if (isspace(minus.ptr[i]))
> 	>                         minus.ptr[i] = '\n';
> 	> 
> 	> by a loop finding the next word boundary.
[...]
> 	> However, as I said, I think it would be much more intuitive to 
> 	> characterize the _words_ instead of the _word boundaries_.

That doesn't work.  You cannot overwrite actual content in the strings
to be diffed with newlines.  The current --color-words exploits the
fact that we don't care about whitespace anyway, so we might as well
replace it with newlines.  But we _do_ care about the words, and in
the regex version you have no guarantee about where they start.

To wit:

  thomas@thomas:~/tmp/foo(master)$ cat >foo
  foo_bar_baz
  quux
  thomas@thomas:~/tmp/foo(master)$ git add foo
  thomas@thomas:~/tmp/foo(master)$ git ci -m initial
  [master (root-commit)]: created f110c6c: "initial"
   1 files changed, 2 insertions(+), 0 deletions(-)
   create mode 100644 foo
  thomas@thomas:~/tmp/foo(master)$ cat >foo
  foo_
  ar_
  az
  quux
  thomas@thomas:~/tmp/foo(master)$ git diff
  diff --git i/foo w/foo
  index 5b34f11..a2762c6 100644
  --- i/foo
  +++ w/foo
  @@ -1,2 +1,4 @@
  -foo_bar_baz
  +foo_
  +ar_
  +az
   quux
  thomas@thomas:~/tmp/foo(master)$ git diff --color-words
  diff --git i/foo w/foo
  index 5b34f11..a2762c6 100644
  --- i/foo
  +++ w/foo
  @@ -1,2 +1,4 @@
  foo_bar_bafoo_
  ar_
  az
  quux
  thomas@thomas:~/tmp/foo(master)$ git diff --color-words='[a-zA-Z]+_?'
  diff --git i/foo w/foo
  index 5b34f11..a2762c6 100644
  --- i/foo
  +++ w/foo
  @@ -1,2 +1,4 @@
  quux

Even without the colours, you can see that it has a blind spot for
changes around a newline.  Perhaps there is an easier way to remember
the word boundaries, but we definitely cannot *forget* about them.

That being said, even though my patch correctly sees the changes, the
above test case also exposes some sort of string overrun :-(
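
For illustration only, here is a minimal sketch of what "remembering
the word boundaries" could look like.  It assumes POSIX <regex.h>; the
names word_span, find_words and words_to_lines are hypothetical and do
not come from git or from either patch in this thread.  The original
buffer is never modified: the matches are recorded as offsets, and the
one-word-per-line text for the line diff is built as a separate copy.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: byte offsets [begin, end) of one word in the text. */
struct word_span {
	size_t begin, end;
};

/*
 * Record where each non-empty match of `pattern` lies in `text`, without
 * modifying `text`.  Returns the number of spans stored, or -1 if the
 * pattern does not compile.  Anchors such as ^ get no special treatment.
 */
static int find_words(const char *text, const char *pattern,
		      struct word_span *out, int max_out)
{
	regex_t re;
	regmatch_t m;
	size_t pos = 0, len = strlen(text);
	int n = 0;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	while (pos < len && n < max_out &&
	       !regexec(&re, text + pos, 1, &m, 0)) {
		if (m.rm_eo == m.rm_so) {
			/* empty match: step past it to guarantee progress */
			pos += m.rm_so + 1;
			continue;
		}
		out[n].begin = pos + m.rm_so;
		out[n].end = pos + m.rm_eo;
		n++;
		pos += m.rm_eo;
	}
	regfree(&re);
	return n;
}

/*
 * Build a fresh buffer with one word per line, suitable as input to a
 * line-based diff.  The caller frees the result.  Returns NULL if the
 * allocation fails.
 */
static char *words_to_lines(const char *text,
			    const struct word_span *w, int n)
{
	size_t total = 0;
	char *buf, *p;
	int i;

	for (i = 0; i < n; i++)
		total += w[i].end - w[i].begin + 1; /* word plus '\n' */
	buf = malloc(total + 1);
	if (!buf)
		return NULL;
	p = buf;
	for (i = 0; i < n; i++) {
		size_t wlen = w[i].end - w[i].begin;
		memcpy(p, text + w[i].begin, wlen);
		p += wlen;
		*p++ = '\n';
	}
	*p = '\0';
	return buf;
}
```

With the pattern '[a-zA-Z]+_?' from the test case, the old file yields
the words foo_, bar_, baz, quux and the new file yields foo_, ar_, az,
quux, so a line diff of the two copies can see the change, and the
recorded offsets say where each word sits in the original text.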

> 	> And I would like to keep the default as-is (together _with_ the 
> 	> performance.  IOW if the user did not specify a regexp, it should fall 
> 	> back to what it does now, which is slow enough).

That's definitely a valid request.

I'll come up with a fixed patch, and probably make it both
funcname-like (Jeff's idea) and command line configurable.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch