From: Junio C Hamano <gitster@pobox.com>
To: "Jean-Baptiste Quenot" <jbq@caraldi.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] Do not chop HTML tags in commit search result
Date: Wed, 13 Feb 2008 11:43:14 -0800
Message-ID: <7vbq6kprql.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <ae63f8b50802130937mddf9df9re2a95bee44661ee3@mail.gmail.com> (Jean-Baptiste Quenot's message of "Wed, 13 Feb 2008 18:37:24 +0100")

"Jean-Baptiste Quenot" <jbq@caraldi.com> writes:

> ... I encountered an annoying bug
> with gitweb 1.5.4.1, when searching for commits, if the search string
> is too long, the generated HTML is munged leading to an ill-formed
> XHTML document.

> Here is the patch, hope it helps:
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index ae2d057..2c0b990 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -3780,7 +3780,10 @@ sub git_search_grep_body {
>                                 my $trail = esc_html($3) || "";
>                                 $trail = chop_str($trail, 30, 10);
> ...
>                                 my $text = "$lead<span
> class=\"match\">$match</span>$trail";
> -                               print chop_str($text, 80, 5) . "<br/>\n";

I think esc_html() and chop_str() are applied in the wrong order
here.  If $3 is overlong, it is cut in the middle of some markup.
Even though chop_str() claims to be "HTML aware", I do not think it
is.  It seems to know about "&entities;" but not about markup.
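
To illustrate the ordering problem, here is a minimal sketch.  It
assumes two hypothetical stand-ins, toy_esc_html() and toy_chop(),
which are NOT gitweb's esc_html() and chop_str(); toy_chop() simply
cuts by character count.  It only shows why chopping a string that
already carries markup is fragile, and why chopping the raw pieces
first avoids the problem.

```perl
use strict;
use warnings;

# Hypothetical stand-ins, not gitweb's real helpers.
sub toy_esc_html {
	my ($s) = @_;
	$s =~ s/&/&amp;/g;	# must run first, or it would re-escape the others
	$s =~ s/</&lt;/g;
	$s =~ s/>/&gt;/g;
	return $s;
}

# Keep at most $len characters and mark the cut with "...".
sub toy_chop {
	my ($s, $len) = @_;
	return $s if length($s) <= $len;
	return substr($s, 0, $len) . "...";
}

# Markup first, chop second: the cut lands inside the <span> tag.
my $text = 'fix <span class="match">bug</span> in parser';
my $broken = toy_chop($text, 10);

# Chop the raw pieces first, escape them, then add the markup.
my ($lead, $match, $trail) = ('fix ', 'bug', ' in parser <main.c>');
my $safe = toy_esc_html($lead)
	. '<span class="match">' . toy_esc_html($match) . '</span>'
	. toy_esc_html(toy_chop($trail, 12));

print "$broken\n";	# fix <span ...
print "$safe\n";	# fix <span class="match">bug</span> in parser &lt;...
```

The first result leaves an unterminated tag in the output, which is
the kind of ill-formed XHTML reported above.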

There are quite a few instances of esc_html() first then chop_str()
in that function, and I think they all deserve to be fixed.

	my $comment = $co{'comment'};
	foreach my $line (@$comment) {
		if ($line =~ m/^(.*)($search_regexp)(.*)$/i) {
			my $lead = esc_html($1) || "";
			$lead = chop_str($lead, 30, 10);
			my $match = esc_html($2) || "";
			my $trail = esc_html($3) || "";
			$trail = chop_str($trail, 30, 10);
			my $text = "$lead<span class=\"match\">$match</span>$trail";
			print chop_str($text, 80, 5) . "<br/>\n";
		}
	}

I think this is trying to fit the result on a line, showing the
match sandwiched between not-too-long pieces taken from the leading
and trailing context ($lead and $trail can be chopped more
aggressively than $match).  But $lead and $trail are escaped and
then chopped, which is already wrong.

I think the body of that if() would be better written like this:

	my ($lead, $match, $trail) = ($1, $2, $3);
	# $slop is the third chop_str() argument; its value is left open here
	$match = chop_str($match, 70, $slop); # in case it is very long...
	my $contextlen = int((80 - length($match)) / 2); # and the remainder...
	if ($contextlen > 30) { $contextlen = 30 } # but not too much
	$trail = chop_str($trail, $contextlen, $slop);
	$lead = chop_str($lead, $contextlen, $slop);

	# escape only after chopping
	$lead = esc_html($lead);
	$match = esc_html($match);
	$trail = esc_html($trail);

	print "$lead<span ...>$match</span>$trail";
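
As a rough, self-contained sketch of the same idea: the code below
wraps the chop-then-escape steps in a hypothetical helper,
format_match_line().  toy_esc_html() and toy_chop() are simplified
stand-ins for gitweb's esc_html() and chop_str() (no slop handling,
no entity awareness), and the class="match" attribute is taken from
the existing code quoted above.  It also assumes the search pattern
contains no capture groups of its own.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for gitweb's esc_html() and chop_str().
sub toy_esc_html {
	my ($s) = @_;
	$s =~ s/&/&amp;/g;
	$s =~ s/</&lt;/g;
	$s =~ s/>/&gt;/g;
	return $s;
}

sub toy_chop {
	my ($s, $len) = @_;
	return $s if length($s) <= $len;
	return substr($s, 0, $len) . "...";
}

# Chop the raw pieces, then escape them, then wrap the match in markup.
# Returns nothing if the line does not match.
sub format_match_line {
	my ($line, $search_regexp) = @_;
	return unless $line =~ m/^(.*)($search_regexp)(.*)$/i;
	my ($lead, $match, $trail) = ($1, $2, $3);

	$match = toy_chop($match, 70);		# in case it is very long
	my $contextlen = int((80 - length($match)) / 2);
	$contextlen = 30 if $contextlen > 30;	# but not too much context
	$lead = toy_chop($lead, $contextlen);
	$trail = toy_chop($trail, $contextlen);

	return toy_esc_html($lead)
		. '<span class="match">' . toy_esc_html($match) . '</span>'
		. toy_esc_html($trail);
}

my $out = format_match_line('Fix a < b comparison in parser', 'comparison');
print "$out\n";
# Fix a &lt; b <span class="match">comparison</span> in parser
```

Because the markup is added last, no chopping step ever sees a tag
or an entity, so neither can be cut in half.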

