From: Michael J Gruber <git@drmicha.warpmail.net>
To: Scott Johnson <scottj75074@yahoo.com>
Cc: git@vger.kernel.org, trast@student.ethz.ch
Subject: Re: html userdiff is not showing all my changes
Date: Wed, 15 Dec 2010 10:06:21 +0100
Message-ID: <4D08850D.3010402@drmicha.warpmail.net>
In-Reply-To: <561247.22837.qm@web110707.mail.gq1.yahoo.com>

Scott Johnson venit, vidit, dixit 15.12.2010 04:47:
> I am attempting to do a word diff of an html source file. Part of the removed 
> html is disappearing from the diff when I enable the fancy html word diff.
> 
> Here's the output from basic `git diff`:
> diff --git a/adv_layout_source.html b/adv_layout_source.html
> index 18a81dd..c4ed609 100644
> --- a/adv_layout_source.html
> +++ b/adv_layout_source.html
> @@ -42,8 +42,8 @@
>        <ul>
>          <li class="ydn-patterns"><em></em><a href="#">ydn-patterns</a></li>
>          <li class="ydn-mail"><em></em><a href="#">ydn-mail</a></li>
> -        <li class="yws-maps"><em></em><a href="#">yws-maps</a></li>
> -        <li class="ydn-delicious"><em></em><a href="#">ydn-delicious</a></li>
> +        <li><em></em><a href="#">yws-maps</a></li>
> +        <li><em></em><a href="#">ydn-delicious</a></li>
>          <li class="yws-flickr"><em></em><a href="#">yws-flickr</a></li>
>          <li class="yws-events"><em></em><a href="#">yws-events</a></li>
>        </ul>
> 
> 
> Here's the default `git diff --word-diff`:
> diff --git a/adv_layout_source.html b/adv_layout_source.html
> index 18a81dd..c4ed609 100644
> --- a/adv_layout_source.html
> +++ b/adv_layout_source.html
> @@ -42,8 +42,8 @@
>       <ul>
>         <li class="ydn-patterns"><em></em><a href="#">ydn-patterns</a></li>
>         <li class="ydn-mail"><em></em><a href="#">ydn-mail</a></li>
>         [-<li class="yws-maps"><em></em><a-]{+<li><em></em><a+} 
> href="#">yws-maps</a></li>
>         [-<li class="ydn-delicious"><em></em><a-]{+<li><em></em><a+} 
> href="#">ydn-delicious</a></li>
>         <li class="yws-flickr"><em></em><a href="#">yws-flickr</a></li>
>         <li class="yws-events"><em></em><a href="#">yws-events</a></li>
>       </ul>
> 
> Which is correct, but less than ideal because it highlights much more than the 
> actual changes.
> 
> So I create a .gitattributes file with one line:
> *.html diff=html
> 
> And rerun `git diff --word-diff`:
> diff --git a/adv_layout_source.html b/adv_layout_source.html
> index 18a81dd..c4ed609 100644
> --- a/adv_layout_source.html
> +++ b/adv_layout_source.html
> @@ -42,8 +42,8 @@
>       <ul>
>         <li class="ydn-patterns"><em></em><a href="#">ydn-patterns</a></li>
>         <li class="ydn-mail"><em></em><a href="#">ydn-mail</a></li>
>         <li[-class="yws-maps"-]><em></em><a href="#">yws-maps</a></li>
>         <li><em></em><a href="#">ydn-delicious</a></li>
>         <li class="yws-flickr"><em></em><a href="#">yws-flickr</a></li>
>         <li class="yws-events"><em></em><a href="#">yws-events</a></li>
>       </ul>
> 
> Yikes! What happened to the second line of changes? The removed code is not 
> displayed at all.
> 
> This is running git 1.7.3.3.
> 
> I suspect the problem is in the html patterns in userdiff.c, but I don't 
> understand the word-diff-regex well enough to fix it.

The wordRegex should really only control what comprises a word, i.e. the
granularity of --word-diff. (Where do we insert additional line-breaks
before running ordinary diff?)
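
To illustrate what I mean by granularity, here is a rough sketch in
Python. It is not git's implementation: it assumes that every match of
the regex is one word, that anything between matches is ignored (as the
--word-diff-regex documentation describes), and that the word lists are
then compared by an ordinary sequence diff (difflib stands in for
that). The names split_words, word_diff and FINE_REGEX are made up for
this example, and FINE_REGEX is a hypothetical pattern, not the html
pattern from userdiff.c.

```python
import difflib
import re


def split_words(text, word_regex):
    """Return every non-overlapping match of word_regex, in order.

    Anything between matches is dropped, i.e. treated as whitespace.
    """
    return [m.group(0) for m in re.finditer(word_regex, text)]


def word_diff(old, new, word_regex=r"\S+"):
    """Compare two strings word by word; return (tag, word) pairs.

    tag is " " for common words, "-" for removed, "+" for added.
    """
    old_words = split_words(old, word_regex)
    new_words = split_words(new, word_regex)
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    result = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            result += [(" ", w) for w in old_words[i1:i2]]
        else:
            result += [("-", w) for w in old_words[i1:i2]]
            result += [("+", w) for w in new_words[j1:j2]]
    return result


# One of the changed lines from the report, before and after.
OLD = '<li class="yws-maps"><em></em><a href="#">yws-maps</a></li>'
NEW = '<li><em></em><a href="#">yws-maps</a></li>'

# Hypothetical finer-grained pattern: runs of characters other than
# <, >, = and whitespace, or else any single non-space character.
FINE_REGEX = r"[^<>=\s]+|[^\s]"

coarse = word_diff(OLD, NEW)             # default: words split on whitespace
fine = word_diff(OLD, NEW, FINE_REGEX)   # words split at tag punctuation too

print([pair for pair in coarse if pair[0] != " "])
print([pair for pair in fine if pair[0] != " "])
```

With the whitespace-only default the whole `<li ...><em></em><a` run
counts as changed, as in the plain --word-diff output you quoted. With
the finer pattern only the attribute words are marked as removed.
Either way the regex only changes where the word boundaries fall; every
changed word still shows up on one side or the other.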

If a wordRegex can make parts of the diff disappear, then there is a problem
deeper in the diff machinery. Can you trim this down to a minimal example?

Michael
