From: Jakub Narebski <jnareb@gmail.com>
To: "Christopher M. Fuhrman" <cfuhrman@panix.com>
Cc: gitster@pobox.com, git@vger.kernel.org, cwilson@cdwilson.us,
sylvain@abstraction.fr
Subject: Re: [PATCH] gitweb: highlight: strip non-printable characters via col(1)
Date: Fri, 26 Aug 2011 21:54:13 +0200
Message-ID: <201108262154.14493.jnareb@gmail.com>
In-Reply-To: <1314053923-13122-1-git-send-email-cfuhrman@panix.com>
On Tue, 23 Aug 2011, Christopher M. Fuhrman wrote:
> The current code, as is, passes control characters, such as form-feed
> (^L) to highlight which then passes it through to the browser. This
> will cause the browser to display one of the following warnings:
>
> Safari v5.1 (6534.50) & Google Chrome v13.0.782.112:
>
> This page contains the following errors:
>
> error on line 657 at column 38: PCDATA invalid Char value 12
> Below is a rendering of the page up to the first error.
>
> Mozilla Firefox 3.6.19 & Mozilla Firefox 5.0:
>
> XML Parsing Error: not well-formed
> Location:
> http://path/to/git/repo/blah/blah
>
> Both errors were generated by gitweb.perl v1.7.3.4 w/ highlight 2.7
> using arch/ia64/kernel/unwind.c from the Linux kernel.
>
> Strip non-printable control-characters by piping the output produced
> by git-cat-file(1) to col(1) as follows:
>
> git cat-file blob deadbeef314159 | col -bx | highlight <args>
>
> Note usage of the '-x' option which tells col(1) to output multiple
> spaces instead of tabs.
Why use an external program (which might not be installed, or might not
strip control characters) instead of making gitweb sanitize the highlighter
output itself? Something like the patch below (which additionally
shows where there are control characters):
-- >8 --
diff --git i/gitweb/gitweb.perl w/gitweb/gitweb.perl
index 7cf12af..192db2c 100755
--- i/gitweb/gitweb.perl
+++ w/gitweb/gitweb.perl
@@ -1517,6 +1517,17 @@ sub esc_path {
return $str;
}
+# Sanitize for use in XHTML + application/xhtml+xml
+sub sanitize {
+ my $str = shift;
+
+ return undef unless defined $str;
+
+ $str = to_utf8($str);
+ $str =~ s|([[:cntrl:]])|quot_cec($1)|eg;
+ return $str;
+}
+
# Make control characters "printable", using character escape codes (CEC)
sub quot_cec {
my $cntrl = shift;
@@ -6546,7 +6557,8 @@ sub git_blob {
$nr++;
$line = untabify($line);
printf qq!<div class="pre"><a id="l%i" href="%s#l%i" class="linenr">%4i</a> %s</div>\n!,
- $nr, esc_attr(href(-replay => 1)), $nr, $nr, $syntax ? to_utf8($line) : esc_html($line, -nbsp=>1);
+ $nr, esc_attr(href(-replay => 1)), $nr, $nr,
+ $syntax ? sanitize($line) : esc_html($line, -nbsp=>1);
}
}
close $fd
-- 8< --
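
To see the effect outside of gitweb, the substitution in sanitize() can
be tried in a standalone script. This is only a rough sketch: the names
quot_cec_sketch() and sanitize_sketch() are made up for illustration,
quot_cec_sketch() is a simplified stand-in for gitweb's quot_cec() (the
escape table and the "cntrl" class name are assumptions here), and the
to_utf8() step is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for gitweb's quot_cec(): turn one
# control character into a visible escape code wrapped in a <span>.
sub quot_cec_sketch {
    my $cntrl = shift;
    my %es = (          # a few well-known character escape codes
        "\t" => '\t',   # tab
        "\n" => '\n',   # newline
        "\r" => '\r',   # carriage return
        "\f" => '\f',   # form feed
    );
    my $chr = exists $es{$cntrl}
            ? $es{$cntrl}
            : sprintf('\%02x', ord($cntrl));  # fallback: hex code
    return qq{<span class="cntrl">$chr</span>};
}

# Same substitution as sanitize() in the patch, without to_utf8().
sub sanitize_sketch {
    my $str = shift;

    return undef unless defined $str;

    $str =~ s|([[:cntrl:]])|quot_cec_sketch($1)|eg;
    return $str;
}

# A line with an embedded form feed (^L), like the one in unwind.c.
my $line = "int x;\f/* new page */";
print sanitize_sketch($line), "\n";
# prints: int x;<span class="cntrl">\f</span>/* new page */
```

Note that [[:cntrl:]] also matches tab, so in this sketch a raw tab
would be escaped as well; in git_blob the line has already gone through
untabify() before it is sanitized.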
--
Jakub Narebski
Poland