From: Thomas Rast <trast@student.ethz.ch>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git@vger.kernel.org
Subject: Re: [ILLUSTRATION PATCH] color-words: take an optional regular expression describing words
Date: Fri, 9 Jan 2009 13:24:33 +0100
Message-ID: <200901091324.40583.trast@student.ethz.ch>
In-Reply-To: <alpine.DEB.1.00.0901091255230.30769@pacific.mpi-cbg.de>
Johannes Schindelin wrote:
>
> In some applications, words are not delimited by white space. To
> allow for that, you can specify a regular expression describing
> what makes a word with
>
> git diff --color-words='^[A-Za-z0-9]*'
[...]
> > Intuitively, all you would have to do is to replace this part in
> > diff_words_show()
> >
> > 	for (i = 0; i < minus.size; i++)
> > 		if (isspace(minus.ptr[i]))
> > 			minus.ptr[i] = '\n';
> >
> > by a loop finding the next word boundary.
[...]
> > However, as I said, I think it would be much more intuitive to
> > characterize the _words_ instead of the _word boundaries_.
That doesn't work. You cannot overwrite actual content in the strings
to be diffed with newlines. The current --color-words exploits the
fact that we don't care about spaces anyway, so we might as well
replace them with newlines, but we _do_ care about the words and in
the regexed version, you have no guarantees about where they might start.
To wit:
thomas@thomas:~/tmp/foo(master)$ cat >foo
foo_bar_baz
quux
thomas@thomas:~/tmp/foo(master)$ git add foo
thomas@thomas:~/tmp/foo(master)$ git ci -m initial
[master (root-commit)]: created f110c6c: "initial"
1 files changed, 2 insertions(+), 0 deletions(-)
create mode 100644 foo
thomas@thomas:~/tmp/foo(master)$ cat >foo
foo_
ar_
az
quux
thomas@thomas:~/tmp/foo(master)$ git diff
diff --git i/foo w/foo
index 5b34f11..a2762c6 100644
--- i/foo
+++ w/foo
@@ -1,2 +1,4 @@
-foo_bar_baz
+foo_
+ar_
+az
quux
thomas@thomas:~/tmp/foo(master)$ git diff --color-words
diff --git i/foo w/foo
index 5b34f11..a2762c6 100644
--- i/foo
+++ w/foo
@@ -1,2 +1,4 @@
foo_bar_bafoo_
ar_
az
quux
thomas@thomas:~/tmp/foo(master)$ git diff --color-words='[a-zA-Z]+_?'
diff --git i/foo w/foo
index 5b34f11..a2762c6 100644
--- i/foo
+++ w/foo
@@ -1,2 +1,4 @@
quux
Even without the colours, you can see that it has a blind spot for
changes around a newline. Perhaps there is an easier way to remember
them, but we definitely cannot *forget* about the word boundaries.
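To spell out what "remembering" the boundaries could look like, here is
a rough sketch. It is not the code from either patch: split_words() and
struct word_list are names made up for illustration, and it assumes
POSIX regcomp()/regexec(). Instead of overwriting bytes in the buffer,
it copies each regex match into a separate buffer, one word per line,
and records where each word started, so a line-based diff could run on
the copy while the original text stays intact.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container, not a git structure. */
struct word_list {
	char *lines;     /* "word1\nword2\n...", NUL-terminated */
	size_t *offsets; /* offsets[i] = start of word i in the original */
	size_t nr;       /* number of words found */
};

/*
 * Hypothetical helper: copy every non-empty match of `pattern` in
 * `text` into out->lines, one per line, without modifying `text`.
 * Returns 0 on success, -1 on a bad pattern or allocation failure.
 */
int split_words(const char *text, const char *pattern,
		struct word_list *out)
{
	regex_t re;
	regmatch_t m;
	size_t len, pos = 0, used = 0;

	assert(text && pattern && out);
	len = strlen(text);

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;

	/* Worst case: every byte is a one-byte word plus its '\n'. */
	out->lines = malloc(2 * len + 1);
	out->offsets = malloc((len + 1) * sizeof(*out->offsets));
	out->nr = 0;
	if (!out->lines || !out->offsets) {
		free(out->lines);
		free(out->offsets);
		regfree(&re);
		return -1;
	}

	/* After the first call we are no longer at the start of the text. */
	while (pos < len &&
	       !regexec(&re, text + pos, 1, &m, pos ? REG_NOTBOL : 0)) {
		size_t start = pos + (size_t)m.rm_so;
		size_t wlen = (size_t)(m.rm_eo - m.rm_so);

		if (!wlen) {
			/* Empty match: step one byte to guarantee progress. */
			pos = start + 1;
			continue;
		}
		memcpy(out->lines + used, text + start, wlen);
		used += wlen;
		out->lines[used++] = '\n';
		out->offsets[out->nr++] = start;
		pos = start + wlen;
	}
	out->lines[used] = '\0';
	regfree(&re);
	return 0;
}
```

Called on "foo_bar_baz\nquux\n" with the pattern [a-zA-Z]+_? this yields
the four lines foo_, bar_, baz and quux with start offsets 0, 4, 8 and
12. The offsets are what would let the output side print whatever sits
between the words (such as the newline) from the original buffer.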
That being said, even though my patch correctly sees the changes, the
above test case also exposes some sort of string overrun :-(
> > And I would like to keep the default as-is (together _with_ the
> > performance. IOW if the user did not specify a regexp, it should fall
> > back to what it does now, which is slow enough).
That's definitely a valid request.
I'll come up with a fixed patch, and probably make the regex
configurable both in a funcname-like way (Jeff's idea) and on the
command line.
--
Thomas Rast
trast@{inf,student}.ethz.ch