From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Subject: [RFC/PATCH (do not use)] gitweb: Introduce and use esc_attr to escape values of tag attributes
Date: Fri, 22 May 2009 17:55:39 +0200 [thread overview]
Message-ID: <200905221755.40239.jnareb@gmail.com> (raw)
In-Reply-To: <200905221735.48310.jnareb@gmail.com>
CGI.pm does by default slight escaping (simple_escape from CGI::Util)
of all _attribute_ values (depending on the value of autoEscape(), by
default on).
This escaping, at least in CGI.pm version 3.10 (most current version
at CPAN is 3.43), is minimal: only '"', '&', '<' and '>' are escaped
using named HTML entity references (", &, < and >
respectively). simple_escape does not do escaping of control
characters such as ^X which are invalid in XHTML (in strict mode).
Note that IIRC escaping '<' and '>' in attributes is not strictly
necessary.
New 'esc_attr' subroutine can be used to do more complete escaping of
attribute values, including dealing with control characters (forbidden
in XHTML in strict validating mode); currently it replaces all control
characters by '?' question mark. For esc_attr() to work correctly you
have to turn off CGI.pm autoescaping to avoid double escaping.
While at it chop_and_esc_str now uses esc_attr instead of simply
replacing control characters with '?' by itself.
The problem is that when autoescaping is turned off we would have to
esc_attr all attributes containing user input (this includes result of
href() subroutine), or risk either double escaping, or not escaping
some fragment.
Alternate solution include:
* stripping of control characters before input is passed as attribute
value, as it was done in chop_and_esc_str; this would include at
minimum format_subject_html
* replacing either make_attributes from CGI.pm, or simple_escape
subrotuine it uses, making it escape characters better
Noticed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
http://thread.gmane.org/gmane.comp.version-control.git/116755/focus=117539
---
This is abandoned version. Still... perhaps esc_attr would be of use
to somebody... Also as negative example in archives.
gitweb/gitweb.perl | 39 +++++++++++++++++++++++++++++++++------
1 files changed, 33 insertions(+), 6 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 05702e4..e505f2f 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1027,7 +1027,7 @@ sub esc_param {
return $str;
}
-# quote unsafe chars in whole URL, so some charactrs cannot be quoted
+# quote unsafe chars in whole URL, so some characters cannot be quoted
sub esc_url {
my $str = shift;
$str =~ s/([^A-Za-z0-9\-_.~();\/;?:@&=])/sprintf("%%%02X", ord($1))/eg;
@@ -1036,6 +1036,26 @@ sub esc_url {
return $str;
}
+# quote and escape tag attribute values; autoEscape has to be turned off
+# otherwise you would get double escaping for '&' ("escape" character)
+sub esc_attr {
+ my $str = shift;
+ return $str unless defined $str;
+
+ my %ent = ( # named HTML entities
+ '"' => '"',
+ '&' => '&',
+ # those are not strictly necessary
+ '<' => '<',
+ '>' => '>',
+ );
+ $str = to_utf8($str);
+ $str =~ s|([\"&<>])|$ent{$1}|eg;
+ $str =~ s|([[:cntrl:]])|?|eg; # alternative would be use quot_xxx($1)
+
+ return $str;
+}
+
# replace invalid utf8 character with SUBSTITUTION sequence
sub esc_html {
my $str = shift;
@@ -1236,8 +1256,10 @@ sub chop_and_escape_str {
if ($chopped eq $str) {
return esc_html($chopped);
} else {
- $str =~ s/([[:cntrl:]])/?/g;
- return $cgi->span({-title=>$str}, esc_html($chopped));
+ my $autoescape = $cgi->autoEscape(undef);
+ my $result = $cgi->span({-title=>esc_attr($str)}, esc_html($chopped));
+ $cgi->autoEscape($autoescape); # restore original value
+ return $result;
}
}
@@ -1458,14 +1480,19 @@ sub format_subject_html {
my ($long, $short, $href, $extra) = @_;
$extra = '' unless defined($extra);
+ my $ret = '';
if (length($short) < length($long)) {
- return $cgi->a({-href => $href, -class => "list subject",
- -title => to_utf8($long)},
+ my $autoescape = $cgi->autoEscape(undef);
+ # or just replace s/([[:cntrl:]])/?/g in -title
+ $ret = $cgi->a({-href => $href, -class => "list subject",
+ -title => esc_attr($long)},
esc_html($short) . $extra);
+ $cgi->autoEscape($autoescape); # restore original value
} else {
- return $cgi->a({-href => $href, -class => "list subject"},
+ $ret = $cgi->a({-href => $href, -class => "list subject"},
esc_html($long) . $extra);
}
+ return $ret;
}
# format git diff header line, i.e. "diff --(git|combined|cc) ..."
--
1.6.3.1
prev parent reply other threads:[~2009-05-22 15:56 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-05-22 15:35 [PATCH] gitweb: Sanitize title attribute in format_subject_html Jakub Narebski
2009-05-22 15:39 ` [PATCH (lite)] " Jakub Narebski
2009-05-22 15:55 ` Jakub Narebski [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200905221755.40239.jnareb@gmail.com \
--to=jnareb@gmail.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.