* [PATCH] make --color-words separate word on ispunct
@ 2008-04-12 10:33 sgala
2008-04-12 15:23 ` Johannes Schindelin
0 siblings, 1 reply; 4+ messages in thread
From: sgala @ 2008-04-12 10:33 UTC (permalink / raw)
To: git; +Cc: Santiago Gala
Note that this may actually be harmful when trying to spot punctuation
changes, but for this use case I don't think color-words is helping
now either.
Signed-off-by: Santiago Gala <sgala@apache.org>
---
diff.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/diff.c b/diff.c
index 8022e67..d301fcc 100644
--- a/diff.c
+++ b/diff.c
@@ -448,7 +448,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
minus.ptr = xmalloc(minus.size);
memcpy(minus.ptr, diff_words->minus.text.ptr, minus.size);
for (i = 0; i < minus.size; i++)
- if (isspace(minus.ptr[i]))
+ if (isspace(minus.ptr[i]) || ispunct(minus.ptr[i]))
minus.ptr[i] = '\n';
diff_words->minus.current = 0;
@@ -456,7 +456,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
plus.ptr = xmalloc(plus.size);
memcpy(plus.ptr, diff_words->plus.text.ptr, plus.size);
for (i = 0; i < plus.size; i++)
- if (isspace(plus.ptr[i]))
+ if (isspace(plus.ptr[i]) || ispunct(plus.ptr[i]))
plus.ptr[i] = '\n';
diff_words->plus.current = 0;
--
1.5.5.44.gdfa65.dirty
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] make --color-words separate word on ispunct
2008-04-12 10:33 [PATCH] make --color-words separate word on ispunct sgala
@ 2008-04-12 15:23 ` Johannes Schindelin
2008-04-12 15:32 ` Ping Yin
0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2008-04-12 15:23 UTC (permalink / raw)
To: sgala; +Cc: git, Santiago Gala
Hi,
On Sat, 12 Apr 2008, sgala@hisitech.com wrote:
> Note that this may actually be harmful when trying to spot punctuation
> changes, but for this use case I don't think color-words is helping now
> either.
I do not know how commonly supported ispunct(), therefore I do not like
the patch too much.
Besides, since long ago I want to make the list of boundary characters
configurable, preferably as a tr(1) style list, but I have not come around
to do that yet.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] make --color-words separate word on ispunct
2008-04-12 15:23 ` Johannes Schindelin
@ 2008-04-12 15:32 ` Ping Yin
2008-04-12 17:50 ` Santiago Gala
0 siblings, 1 reply; 4+ messages in thread
From: Ping Yin @ 2008-04-12 15:32 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: sgala, git, Santiago Gala
On Sat, Apr 12, 2008 at 11:23 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
>
> On Sat, 12 Apr 2008, sgala@hisitech.com wrote:
>
> > Note that this may actually be harmful when trying to spot punctuation
> > changes, but for this use case I don't think color-words is helping now
> > either.
>
> I do not know how commonly supported ispunct(), therefore I do not like
> the patch too much.
>
> Besides, since long ago I want to make the list of boundary characters
> configurable, preferably as a tr(1) style list, but I have not come around
> to do that yet.
>
It is so good an idea. I look forward to it. Futher, should
--color-words support
multibyte characters where every character is a boundary?
--
Ping Yin
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] make --color-words separate word on ispunct
2008-04-12 15:32 ` Ping Yin
@ 2008-04-12 17:50 ` Santiago Gala
0 siblings, 0 replies; 4+ messages in thread
From: Santiago Gala @ 2008-04-12 17:50 UTC (permalink / raw)
To: Ping Yin; +Cc: Johannes Schindelin, git
El sáb, 12-04-2008 a las 23:32 +0800, Ping Yin escribió:
> On Sat, Apr 12, 2008 at 11:23 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > Hi,
> >
> >
> > On Sat, 12 Apr 2008, sgala@hisitech.com wrote:
> >
> > > Note that this may actually be harmful when trying to spot punctuation
> > > changes, but for this use case I don't think color-words is helping now
> > > either.
> >
> > I do not know how commonly supported ispunct(), therefore I do not like
> > the patch too much.
> >
I didn't like the patch that much either, but at least it was a quick
proof of concept. :)
re: support of ispunct, ispunct checks, according to the linux man page,
for:
any printable character which is not a space or an alphanumeric
character.
so isspace(c) || ispunct(c) -> isprint(c) && !isalnum(c)
> > Besides, since long ago I want to make the list of boundary characters
> > configurable, preferably as a tr(1) style list, but I have not come around
> > to do that yet.
> >
That would be cool, it was my first thought until I saw this "easy try".
But I'm not a C programmer, I was just trying to spot the correctness of
a few name additions in lines of comma separated ids of 100 names or
something like that. The patch I sent is not perfect, but achieved 80%
of what I wanted with 10 minutes of effort (including build, test and
sending the patch).
On the other hand, while --color-words is very useful for text or
detecting typos, with big text changes it sometimes gives worse results
than --color, see for instance, on the git repo, the second hunk of
git diff --stat -p --color-words
f59774add488a6c5fb440a4aaa7255f594b1027d^ -- builtin-fetch.c
(and just --color) Not sure how to fix it, or, ideally, having some
automated way to switch between line-oriented coloring and word-oriented
coloring depending of density of changes.
>
> It is so good an idea. I look forward to it. Futher, should
> --color-words support
> multibyte characters where every character is a boundary?
>
This would require more changes, to the
iswspace/iswpunct/iswprint/iswalnum functions, with associated change
from chars to wide chars.
Regards
Santiago
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-04-12 17:51 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-12 10:33 [PATCH] make --color-words separate word on ispunct sgala
2008-04-12 15:23 ` Johannes Schindelin
2008-04-12 15:32 ` Ping Yin
2008-04-12 17:50 ` Santiago Gala
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).