From: "Michał Kiedrowicz" <michal.kiedrowicz@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH v2 7/8] gitweb: Highlight interesting parts of diff
Date: Thu, 29 Mar 2012 21:59:22 +0200 [thread overview]
Message-ID: <20120329215922.606f54d4@gmail.com> (raw)
In-Reply-To: <201203292142.21680.jnareb@gmail.com>
Jakub Narebski <jnareb@gmail.com> wrote:
> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
>
> > Reading diff output is sometimes very hard, even if it's colored,
> > especially if lines differ only in few characters. This is often true
> > when a commit fixes a typo or renames some variables or functions.
> >
> > This commit teaches gitweb to highlight characters that are different
> > between old and new line with a light green/red background. This should
> > work in the similar manner as in Trac or GitHub.
> >
> > The algorithm that compares lines is based on contrib/diff-highlight.
> > Basically, it works by determining common prefix/suffix of corresponding
> > lines and highlightning only the middle part of lines. For more
> > information, see contrib/diff-highlight/README.
> >
> Nice description.
>
> > Combined diffs are not supported but a following commit will change it.
> >
> O.K.
>
> > Two changes in format_diff_line() were needed to allow diff refinement
> > highlightning to work. Firstly, format_diff_line() was taught to not
> > esc_html() line that was passed as a reference. This was needed because
> > refining the highlight must be performed on unescaped lines and it uses
> > a <span> element to mark interesting parts of the line.
>
> Well, actually we just need to make format_diff_line to not esc_html
> passed line which is already esc_html'ed or esc_html_hl_regions'ed,
> i.e. to avoid double escaping. Passing line as reference if it is
> to be treated as HTML is one possibility, and I think it is a good
> convention to start to use.
>
> > Secondly, the
> > lines are untabified before comparing because calling untabify()
> > after inserting <span>'s could improperly convert tabs to spaces.
>
> Well, it is actually because untabify() must work on original text to
> work correctly, i.e. to convert tabs to spaces according to tab stops.
> It must precede esc_html'ing, and even more esc_html_hl_regions'ing.
>
> >
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
>
> Anyway this is a good change, much cleaner than previous version
>
> Acked-by: Jakub Narębski <jnareb@gmail.com>
Thanks.
>
> > ---
> > gitweb/gitweb.perl | 84 ++++++++++++++++++++++++++++++++++++++++++----
> > gitweb/static/gitweb.css | 8 ++++
> > 2 files changed, 85 insertions(+), 7 deletions(-)
> >
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index db32588..872ba12 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -2426,14 +2426,14 @@ sub format_cc_diff_chunk_header {
> > }
> >
> > # process patch (diff) line (not to be used for diff headers),
> > -# returning HTML-formatted (but not wrapped) line
> > +# returning HTML-formatted (but not wrapped) line.
> > +# If the line is already esc_html()'ed, pass it as a reference.
>
> Errr... I think the explanation here must be in reverse:
>
> +# If the line is passed as a reference, it is treated as HTML,
> +# and not esc_html()'ed.
Yeah, I thought about that option too :).
>
> > sub format_diff_line {
> > my ($line, $diff_class, $from, $to) = @_;
> >
> > - chomp $line;
> > - $line = untabify($line);
> > -
> > - if ($from && $to && $line =~ m/^\@{2} /) {
> > + if (ref($line)) {
> > + $line = $$line;
> > + } elsif ($from && $to && $line =~ m/^\@{2} /) {
> > $line = format_unidiff_chunk_header($line, $from, $to);
> > } elsif ($from && $to && $line =~ m/^\@{3}/) {
> > $line = format_cc_diff_chunk_header($line, $from, $to);
> > @@ -5054,10 +5054,80 @@ sub print_inline_diff_lines {
> > print foreach (@$add);
> > }
> >
> > +# Format removed and added line, mark changed part and HTML-format them.
> > +# Impementation is based on contrib/diff-highlight
> > +sub format_rem_add_line {
>
> Wouldn't a better name be format_rem_add_pair() or format_rem_add_lines(),
> or format_rem_add_lines_pair()?
OK.
>
> > + my ($rem, $add) = @_;
> > + my @rem = split(//, $rem);
> > + my @add = split(//, $add);
> > + my ($esc_rem, $esc_add);
> > + # Ignore +/- character, thus $prefix_len is set to 1.
> > + my ($prefix_len, $suffix_len) = (1, 0);
>
> I wonder if it wouldn't be slightly cleaner to count $prefix_len from
> +/- rather than start of line. It means
>
> + # Prefix does not count initial +/- character.
> + my ($prefix_len, $suffix_len) = (0, 0);
>
> Perhaps even remove initial +/- from @add and @rem.
>
> + shift @rem;
> + shift @add;
>
> Ehh... now I see that starting $prefix_len with 1 might be not so bad
> idea after all.
>
> > + my ($prefix_is_space, $suffix_is_space) = (1, 1);
>
> It is not entirely true: $prefix_is_space is really $prefix_is_space_or_empty.
> It might be a better idea to use
>
> + my ($prefix_has_nonspace, $suffix_has_nonspace);
>
> using the fact that undefined value is false.
>
> > +
> > + while ($prefix_len < @rem && $prefix_len < @add) {
> > + last if ($rem[$prefix_len] ne $add[$prefix_len]);
> > +
> > + $prefix_is_space = 0 if ($rem[$prefix_len] !~ /\s/);
>
> + $prefix_has_nonspace = 1 if ($rem[$prefix_len] =~ /\s/);
>
> > + $prefix_len++;
> > + }
> > +
> > + my $shorter = (@rem < @add) ? @rem : @add;
> > + while ($prefix_len + $suffix_len < $shorter) {
> > + last if ($rem[-1 - $suffix_len] ne $add[-1 - $suffix_len]);
> > +
> > + $suffix_is_space = 0 if ($rem[-1 - $suffix_len] !~ /\s/);
>
> + $suffix_has_nonspace = 1 if ($rem[-1 - $suffix_len] =~ /\s/);
>
> > + $suffix_len++;
> > + }
> > +
> > + # Mark lines that are different from each other, but have some common
> > + # part that isn't whitespace. If lines are completely different, don't
> > + # mark them because that would make output unreadable, especially if
> > + # diff consists of multiple lines.
> > + if (($prefix_len == 1 && $suffix_len == 0) ||
> > + ($prefix_is_space && $suffix_is_space)) {
>
> Actually ($prefix_is_space && $suffix_is_space) is enough, but cryptic.
Yes, you caught was I thought too but hadn't wrote explicitly. The
first condition ($prefix_len == 1 && $suffix_len == 0) is redundant but
makes the condition easier to understand.
> With $prefix_is_space (== $prefix_is_space_or_empty) -> $prefix_has_nonspace
> it would be
>
> + if (not ($prefix_has_nonspace || $suffix_has_nonspace)) {
>
> Anyway, it is not as if it is not clear enough.
I may play with this for a while to see which version looks best.
>
> > + $esc_rem = esc_html($rem, -nbsp=>1);
> > + $esc_add = esc_html($add, -nbsp=>1);
> > + } else {
> > + $esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len, @rem - $suffix_len], -nbsp=>1);
> > + $esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len, @add - $suffix_len], -nbsp=>1);
>
> + $esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len+1, @rem+1 - $suffix_len], -nbsp=>1);
> + $esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len+1, @add+1 - $suffix_len], -nbsp=>1);
>
> So probably not worth it.
>
> > + }
> > +
> > + return format_diff_line(\$esc_rem, 'rem'),
> > + format_diff_line(\$esc_add, 'add');
>
> + return format_diff_line(\$esc_rem, 'rem'),
> + format_diff_line(\$esc_add, 'add');
>
> > +}
> > +
> > +# HTML-format diff context, removed and added lines.
> > +sub format_ctx_rem_add_lines {
> > + my ($ctx, $rem, $add, $is_combined) = @_;
> > + my (@new_ctx, @new_rem, @new_add);
> > +
> > + # Highlight if every removed line has a corresponding added line.
> > + # Combined diffs are not supported ATM.
>
> ATM == at this moment? Please say so.
OK.
>
> > + if (!$is_combined && @$add > 0 && @$add == @$rem) {
> > + for (my $i = 0; $i < @$add; $i++) {
> > + my ($line_rem, $line_add) = format_rem_add_line(
> > + $rem->[$i], $add->[$i]);
>
> + my ($line_rem, $line_add) =
> + format_rem_add_line($rem->[$i], $add->[$i]);
>
> > + push @new_rem, $line_rem;
> > + push @new_add, $line_add;
> > + }
> > + } else {
> > + @new_rem = map { format_diff_line($_, 'rem') } @$rem;
> > + @new_add = map { format_diff_line($_, 'add') } @$add;
> > + }
> > +
> > + @new_ctx = map { format_diff_line($_, 'ctx') } @$ctx;
> > +
> > + return (\@new_ctx, \@new_rem, \@new_add);
> > +}
>
> Nice.
>
> > +
> > # Print context lines and then rem/add lines.
> > sub print_diff_lines {
> > my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
> >
> > + ($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add,
> > + $is_combined);
> > +
>
> + ($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add, $is_combined);
> +
>
> Unless the line is too long.
>
> > if ($diff_style eq 'sidebyside' && !$is_combined) {
> > print_sidebyside_diff_lines($ctx, $rem, $add);
> > } else {
> > @@ -5089,11 +5159,11 @@ sub print_diff_chunk {
> > foreach my $line_info (@chunk) {
> > my ($class, $line) = @$line_info;
> >
> > - $line = format_diff_line($line, $class, $from, $to);
> > + $line = untabify($line);
>
> O.K. you move untabify() out of format_diff_line(), and must now
> ensure that caller uses it before calling format_diff_line() or
> format_ctx_rem_add_lines() (which uses format_diff_line() itself).
>
> I wonder if leaving untabify() (and chomp) in format_diff_line(),
> but not running it if passed reference, and running untabify() in
> format_ctx_rem_add_lines() wouldn't be a better, less fragile code.
>
I can try that.
> >
> > # print chunk headers
> > if ($class && $class eq 'chunk_header') {
> > - print $line;
> > + print format_diff_line($line, $class, $from, $to);
>
> O.K., only 'chunk_header' are not formatted.
>
> > next;
> > }
>
> [I have to take a look how final version of print_diff_lines() looks
> like in this commit; the patch alone is not enough]
>
> Right, so we format just before printing, and print_* do formatting
> itself before printing. Nice and clean.
Thanks.
>
> >
> > diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
> > index c530355..3c4a3c9 100644
> > --- a/gitweb/static/gitweb.css
> > +++ b/gitweb/static/gitweb.css
> > @@ -438,6 +438,10 @@ div.diff.add {
> > color: #008800;
> > }
> >
> > +div.diff.add span.marked {
> > + background-color: #77ff77;
> > +}
> > +
> > div.diff.from_file a.path,
> > div.diff.from_file {
> > color: #aa0000;
> > @@ -447,6 +451,10 @@ div.diff.rem {
> > color: #cc0000;
> > }
> >
> > +div.diff.rem span.marked {
> > + background-color: #ff7777;
> > +}
> > +
> > div.diff.chunk_header a,
> > div.diff.chunk_header {
> > color: #990099;
> > --
>
> Nice styling.
>
next prev parent reply other threads:[~2012-03-29 19:59 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements Michał Kiedrowicz
2012-03-24 18:58 ` Jakub Narebski
2012-03-24 23:38 ` Michał Kiedrowicz
2012-03-29 18:04 ` [PATCH] gitweb: Use descriptive names in esc_html_hl_regions() Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html() Michał Kiedrowicz
2012-03-24 19:15 ` Jakub Narebski
2012-03-24 23:31 ` Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines() Michał Kiedrowicz
2012-03-28 14:33 ` Jakub Narebski
2012-03-29 17:25 ` Michał Kiedrowicz
2012-03-30 13:37 ` Jakub Narebski
2012-03-23 22:56 ` [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs Michał Kiedrowicz
2012-03-28 15:56 ` Jakub Narebski
2012-03-29 17:31 ` Michał Kiedrowicz
2012-03-30 13:34 ` Jakub Narebski
2012-03-30 13:37 ` Michal Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line() Michał Kiedrowicz
2012-03-29 16:14 ` Jakub Narebski
2012-03-29 16:49 ` Jakub Narebski
2012-03-29 17:36 ` Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk() Michał Kiedrowicz
2012-03-29 16:59 ` Jakub Narebski
2012-03-29 17:41 ` Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 7/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
2012-03-29 19:42 ` Jakub Narebski
2012-03-29 19:59 ` Michał Kiedrowicz [this message]
2012-03-23 22:56 ` [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs Michał Kiedrowicz
2012-03-29 23:37 ` Jakub Narebski
2012-03-30 6:49 ` Michal Kiedrowicz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120329215922.606f54d4@gmail.com \
--to=michal.kiedrowicz@gmail.com \
--cc=git@vger.kernel.org \
--cc=jnareb@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).