* Potential bug in --color-words output @ 2022-10-28 21:08 Simeon Krastnikov [not found] ` <198be0ca-4768-d3ce-caf3-bce4b1bb1e49@gmail.com> ` (2 more replies) 0 siblings, 3 replies; 4+ messages in thread From: Simeon Krastnikov @ 2022-10-28 21:08 UTC (permalink / raw) To: git Hello, Given an initial file with the contents "not to be", which I then change to "to be", the output of 'git diff --color-words', is notto be with the first three letters colored red. To me this seems incorrect as it implies, or at least misleadingly suggests, that there was no space between "not" and "to" in the original file. (Even though in that case the output is actually "nottoto be" with the "notto" in red and "to" in green.) If instead I start with a file with contents "to be", which I then change to "not to be", then the output is as expected: not to be (First three letters colored green.) Am I correct in seeing this as a bug? If so, any tips on what parts of diff.c to look at when starting a patch? Thanks, Simeon ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <198be0ca-4768-d3ce-caf3-bce4b1bb1e49@gmail.com>]
* Re: Potential bug in --color-words output [not found] ` <198be0ca-4768-d3ce-caf3-bce4b1bb1e49@gmail.com> @ 2022-11-07 23:23 ` Stefan Beller 0 siblings, 0 replies; 4+ messages in thread From: Stefan Beller @ 2022-11-07 23:23 UTC (permalink / raw) To: Simeon Krastnikov, Git Mailing List; +Cc: gitster, phillip.wood Hi, in my copy of git there is: ---- git % git grep --line-number --no-color color-words Documentation/RelNotes/1.5.4.txt:377: * "git diff --color-words" colored context lines in a wrong color. Documentation/RelNotes/1.6.2.txt:100:* The definition of what constitutes a word for "git diff --color-words" Documentation/RelNotes/1.6.5.3.txt:23: * "git diff --color-words -U0" didn't work correctly. Documentation/RelNotes/1.7.2.txt:64: * "git diff --word-diff=<mode>" extends the existing "--color-words" Documentation/RelNotes/1.7.2.txt:141: * "git diff --graph" works better with "--color-words" and other options Documentation/diff-options.txt:475:--color-words[=<regex>]:: contrib/completion/git-completion.bash:1718: --no-color --color-words --no-renames --check contrib/diff-highlight/README:9:You can use "--color-words" to highlight only the changed portions of diff.c:1914: * '--color-words' algorithm can be described as: diff.c:1929: * For '--graph' to work with '--color-words', we need to output the graph prefix diff.c:2138:/* In "color-words" mode, show word-diff of words accumulated in the buffer */ diff.c:5566: OPT_CALLBACK_F(0, "color-words", options, N_("<regex>"), gitk-git/gitk:207: "--color-words*" - "--word-diff=color" { t/t4034-diff-words.sh:70: word_diff --color-words t/t4034-diff-words.sh:100: word_diff --color-words && t/t4034-diff-words.sh:180: word_diff --color-words --unified=0 t/t4034-diff-words.sh:185: word_diff --color-words="[a-z]+" t/t4034-diff-words.sh:190: word_diff --color-words="[a-z${LF}]*" t/t4034-diff-words.sh:203: word_diff --color-words="[a-z]+" t/t4034-diff-words.sh:208: word_diff --color-words t/t4034-diff-words.sh:217: word_diff --color-words="[a-z]+" t/t4034-diff-words.sh:240: word_diff --color-words t/t4034-diff-words.sh:262: word_diff --color-words t/t4034-diff-words.sh:278: word_diff --color-words="a+" t/t4034-diff-words.sh:294: word_diff --color-words=. ---- So there are tests in t/t4034-diff-words.sh and probably around diff.c:1914- diff.c:2138 Best, Stefan Am Mo., 7. Nov. 2022 um 12:43 Uhr schrieb Simeon Krastnikov <skrastnikov@gmail.com>: > > Hi Junio, Phillip, and Stefan, > > I apologize for forwarding this to you personally, but based on the recent commit history, I figured that you would be the right persons to ask about this. > > Sorry again, if you consider this to be spam. > > Best, > > Simeon > > > > -------- Forwarded Message -------- > Subject: Potential bug in --color-words output > Date: Fri, 28 Oct 2022 23:08:09 +0200 > From: Simeon Krastnikov <skrastnikov@gmail.com> > To: git@vger.kernel.org > > > Hello, > > Given an initial file with the contents "not to be", which I then change to "to be", the output of 'git diff --color-words', is > > notto be > > with the first three letters colored red. To me this seems incorrect as it implies, or at least misleadingly suggests, that there was no space between "not" and "to" in the original file. (Even though in that case the output is actually "nottoto be" with the "notto" in red and "to" in green.) > > If instead I start with a file with contents "to be", which I then change to "not to be", then the output is as expected: > > not to be > > (First three letters colored green.) > > Am I correct in seeing this as a bug? If so, any tips on what parts of diff.c to look at when starting a patch? > > Thanks, > > Simeon > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Potential bug in --color-words output 2022-10-28 21:08 Potential bug in --color-words output Simeon Krastnikov [not found] ` <198be0ca-4768-d3ce-caf3-bce4b1bb1e49@gmail.com> @ 2022-11-08 7:27 ` Johannes Sixt 2022-11-10 13:51 ` Johannes Schindelin 2 siblings, 0 replies; 4+ messages in thread From: Johannes Sixt @ 2022-11-08 7:27 UTC (permalink / raw) To: Simeon Krastnikov; +Cc: git Am 28.10.22 um 23:08 schrieb Simeon Krastnikov: > Hello, > > Given an initial file with the contents "not to be", which I then change > to "to be", the output of 'git diff --color-words', is > > notto be > > with the first three letters colored red. To me this seems incorrect as > it implies, or at least misleadingly suggests, that there was no space > between "not" and "to" in the original file. (Even though in that case > the output is actually "nottoto be" with the "notto" in red and "to" in > green.) > > If instead I start with a file with contents "to be", which I then > change to "not to be", then the output is as expected: > > not to be > > (First three letters colored green.) > > Am I correct in seeing this as a bug? If so, any tips on what parts of > diff.c to look at when starting a patch? Well, not really. When you have a file with Line one. Line two. then change it to Line ONE. Line TWO. then --color-words currently prints it as Line one.ONE. Line two.TWO. because it does not print the whitespace after[*] a sequence of deleted words. But if it were printed, we would see Line one. ONE. Line two. TWO. That is considered inferior; hence, it isn't printed. The current algorithm produces sensible output in the vast majority of cases while also being fairly straight-forward. To make it work "better" (for some definition of that word) in the borderline cases, the algorithm would have to be made considerably more sophisticated. [*] It might be whitespace before a sequence of words, but that does not change the gist of the argument. -- Hannes ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Potential bug in --color-words output 2022-10-28 21:08 Potential bug in --color-words output Simeon Krastnikov [not found] ` <198be0ca-4768-d3ce-caf3-bce4b1bb1e49@gmail.com> 2022-11-08 7:27 ` Johannes Sixt @ 2022-11-10 13:51 ` Johannes Schindelin 2 siblings, 0 replies; 4+ messages in thread From: Johannes Schindelin @ 2022-11-10 13:51 UTC (permalink / raw) To: Simeon Krastnikov; +Cc: git [-- Attachment #1: Type: text/plain, Size: 1143 bytes --] Hi Simeon, On Fri, 28 Oct 2022, Simeon Krastnikov wrote: > Given an initial file with the contents "not to be", which I then change to > "to be", the output of 'git diff --color-words', is > > notto be > > with the first three letters colored red. This seems to be a limitation of the original `--color-words` design. FWIW I find myself using these three aliases all the time now, i.e. using a different word-coloring mode that works much better for me: [alias] cwdiff = diff --color-words=\"[A-Za-z0-9_]+|.\" cwlog = log --color-words=\"[A-Za-z0-9_]+|.\" cwshow = show --color-words=\"[A-Za-z0-9_]+|.\" Explanation: the regex `[A-Za-z0-9_]+|.` treats runs of alphanumerical characters (including the underscore) as words, and also treats any individual characters that do not fall into that category as "words". The only downside of this is that the coloring does not help identifying white-space differences because only the foreground color is changed (which does not result in any visible difference for white-space characters). Maybe these aliases help you, too? Ciao, Johannes ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-11-10 13:51 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-10-28 21:08 Potential bug in --color-words output Simeon Krastnikov [not found] ` <198be0ca-4768-d3ce-caf3-bce4b1bb1e49@gmail.com> 2022-11-07 23:23 ` Stefan Beller 2022-11-08 7:27 ` Johannes Sixt 2022-11-10 13:51 ` Johannes Schindelin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).