git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] make --color-words separate word on ispunct
@ 2008-04-12 10:33 sgala
  2008-04-12 15:23 ` Johannes Schindelin
  0 siblings, 1 reply; 4+ messages in thread
From: sgala @ 2008-04-12 10:33 UTC (permalink / raw)
  To: git; +Cc: Santiago Gala

Note that this may actually be harmful when trying to spot punctuation
changes, but for this use case I don't think color-words is helping
now either.

Signed-off-by: Santiago Gala <sgala@apache.org>
---
 diff.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index 8022e67..d301fcc 100644
--- a/diff.c
+++ b/diff.c
@@ -448,7 +448,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
 	minus.ptr = xmalloc(minus.size);
 	memcpy(minus.ptr, diff_words->minus.text.ptr, minus.size);
 	for (i = 0; i < minus.size; i++)
-		if (isspace(minus.ptr[i]))
+		if (isspace(minus.ptr[i]) || ispunct(minus.ptr[i]))
 			minus.ptr[i] = '\n';
 	diff_words->minus.current = 0;
 
@@ -456,7 +456,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
 	plus.ptr = xmalloc(plus.size);
 	memcpy(plus.ptr, diff_words->plus.text.ptr, plus.size);
 	for (i = 0; i < plus.size; i++)
-		if (isspace(plus.ptr[i]))
+		if (isspace(plus.ptr[i]) || ispunct(plus.ptr[i]))
 			plus.ptr[i] = '\n';
 	diff_words->plus.current = 0;
 
-- 
1.5.5.44.gdfa65.dirty

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] make --color-words separate word on ispunct
  2008-04-12 10:33 [PATCH] make --color-words separate word on ispunct sgala
@ 2008-04-12 15:23 ` Johannes Schindelin
  2008-04-12 15:32   ` Ping Yin
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2008-04-12 15:23 UTC (permalink / raw)
  To: sgala; +Cc: git, Santiago Gala

Hi,

On Sat, 12 Apr 2008, sgala@hisitech.com wrote:

> Note that this may actually be harmful when trying to spot punctuation 
> changes, but for this use case I don't think color-words is helping now 
> either.

I do not know how commonly supported ispunct(), therefore I do not like 
the patch too much.

Besides, since long ago I want to make the list of boundary characters 
configurable, preferably as a tr(1) style list, but I have not come around 
to do that yet.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] make --color-words separate word on ispunct
  2008-04-12 15:23 ` Johannes Schindelin
@ 2008-04-12 15:32   ` Ping Yin
  2008-04-12 17:50     ` Santiago Gala
  0 siblings, 1 reply; 4+ messages in thread
From: Ping Yin @ 2008-04-12 15:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: sgala, git, Santiago Gala

On Sat, Apr 12, 2008 at 11:23 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
>
>  On Sat, 12 Apr 2008, sgala@hisitech.com wrote:
>
>  > Note that this may actually be harmful when trying to spot punctuation
>  > changes, but for this use case I don't think color-words is helping now
>  > either.
>
>  I do not know how commonly supported ispunct(), therefore I do not like
>  the patch too much.
>
>  Besides, since long ago I want to make the list of boundary characters
>  configurable, preferably as a tr(1) style list, but I have not come around
>  to do that yet.
>

It is so good an idea. I look forward to it. Futher, should
--color-words support
multibyte characters where every character is a boundary?



-- 
Ping Yin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] make --color-words separate word on ispunct
  2008-04-12 15:32   ` Ping Yin
@ 2008-04-12 17:50     ` Santiago Gala
  0 siblings, 0 replies; 4+ messages in thread
From: Santiago Gala @ 2008-04-12 17:50 UTC (permalink / raw)
  To: Ping Yin; +Cc: Johannes Schindelin, git

El sáb, 12-04-2008 a las 23:32 +0800, Ping Yin escribió:
> On Sat, Apr 12, 2008 at 11:23 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > Hi,
> >
> >
> >  On Sat, 12 Apr 2008, sgala@hisitech.com wrote:
> >
> >  > Note that this may actually be harmful when trying to spot punctuation
> >  > changes, but for this use case I don't think color-words is helping now
> >  > either.
> >
> >  I do not know how commonly supported ispunct(), therefore I do not like
> >  the patch too much.
> >

I didn't like the patch that much either, but at least it was a quick
proof of concept. :)

re: support of ispunct, ispunct checks, according to the linux man page,
for:

any printable character which is not a  space  or  an alphanumeric
character.

so isspace(c) || ispunct(c) -> isprint(c) && !isalnum(c)

> >  Besides, since long ago I want to make the list of boundary characters
> >  configurable, preferably as a tr(1) style list, but I have not come around
> >  to do that yet.
> >

That would be cool, it was my first thought until I saw this "easy try".
But I'm not a C programmer, I was just trying to spot the correctness of
a few name additions in lines of comma separated ids of 100 names or
something like that. The patch I sent is not perfect, but achieved 80%
of what I wanted with 10 minutes of effort (including build, test and
sending the patch).

On the other hand, while --color-words is very useful for text or
detecting typos, with big text changes it sometimes gives worse results
than --color, see for instance, on the git repo, the second hunk of

git diff --stat -p --color-words
f59774add488a6c5fb440a4aaa7255f594b1027d^ -- builtin-fetch.c

(and just --color) Not sure how to fix it, or, ideally, having some
automated way to switch between line-oriented coloring and word-oriented
coloring depending of density of changes.

> 
> It is so good an idea. I look forward to it. Futher, should
> --color-words support
> multibyte characters where every character is a boundary?
> 

This would require more changes, to the
iswspace/iswpunct/iswprint/iswalnum functions, with associated change
from chars to wide chars.

Regards
Santiago

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-04-12 17:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-12 10:33 [PATCH] make --color-words separate word on ispunct sgala
2008-04-12 15:23 ` Johannes Schindelin
2008-04-12 15:32   ` Ping Yin
2008-04-12 17:50     ` Santiago Gala

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).