* [PATCH] gitweb: Consolidate escaping/validation of query string
@ 2006-09-23 22:18 Petr Baudis
2006-09-23 22:36 ` Jakub Narebski
0 siblings, 1 reply; 7+ messages in thread
From: Petr Baudis @ 2006-09-23 22:18 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Consider:
http://repo.or.cz/?p=glibc-cvs.git;a=tree;h=2609cb0411389325f4ee2854cc7159756eb0671e;hb=2609cb0411389325f4ee2854cc7159756eb0671e
(click on the funny =__ify file)
We ought to handle anything in filenames and I actually see no reason why
we don't, modulo very little missing escaping that this patch hopefully
also fixes.
I have also made esc_param() escape [?=&;]. Not escaping [&;] was downright
buggy and [?=] just feels better escaped. ;-) YMMV.
Signed-off-by: Petr Baudis <pasky@suse.cz>
---
gitweb/gitweb.perl | 32 +++++++++++---------------------
1 files changed, 11 insertions(+), 21 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 2c6b197..bd47985 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -212,19 +212,9 @@ if (defined $project) {
}
}
+# We have to handle those containing any characters:
our $file_name = $cgi->param('f');
-if (defined $file_name) {
- if (!validate_input($file_name)) {
- die_error(undef, "Invalid file parameter");
- }
-}
-
our $file_parent = $cgi->param('fp');
-if (defined $file_parent) {
- if (!validate_input($file_parent)) {
- die_error(undef, "Invalid file parent parameter");
- }
-}
our $hash = $cgi->param('h');
if (defined $hash) {
@@ -305,7 +295,7 @@ sub evaluate_path_info {
$action ||= "blob_plain";
}
$hash_base ||= validate_input($refname);
- $file_name ||= validate_input($pathname);
+ $file_name ||= $pathname;
} elsif (defined $refname) {
# we got "project.git/branch"
$action ||= "shortlog";
@@ -416,7 +406,7 @@ # quote unsafe chars, but keep the slash
# correct, but quoted slashes look too horrible in bookmarks
sub esc_param {
my $str = shift;
- $str =~ s/([^A-Za-z0-9\-_.~();\/;?:@&=])/sprintf("%%%02X", ord($1))/eg;
+ $str =~ s/([^A-Za-z0-9\-_.~()\/:@])/sprintf("%%%02X", ord($1))/eg;
$str =~ s/\+/%2B/g;
$str =~ s/ /\+/g;
return $str;
@@ -1289,7 +1279,7 @@ sub git_header_html {
if (defined $action) {
$title .= "/$action";
if (defined $file_name) {
- $title .= " - $file_name";
+ $title .= " - " . esc_html($file_name);
if ($action eq "tree" && $file_name !~ m|/$|) {
$title .= "/";
}
@@ -2439,7 +2429,7 @@ sub git_blame2 {
if ($ftype !~ "blob") {
die_error("400 Bad Request", "Object is not a blob");
}
- open ($fd, "-|", git_cmd(), "blame", '-l', $file_name, $hash_base)
+ open ($fd, "-|", git_cmd(), "blame", '-l', '--', $file_name, $hash_base)
or die_error(undef, "Open git-blame failed");
git_header_html();
my $formats_nav =
@@ -3081,12 +3071,12 @@ sub git_blobdiff {
if (defined $file_name) {
if (defined $file_parent) {
$diffinfo{'status'} = '2';
- $diffinfo{'from_file'} = $file_parent;
- $diffinfo{'to_file'} = $file_name;
+ $diffinfo{'from_file'} = esc_html($file_parent);
+ $diffinfo{'to_file'} = esc_html($file_name);
} else { # assume not renamed
$diffinfo{'status'} = '1';
- $diffinfo{'from_file'} = $file_name;
- $diffinfo{'to_file'} = $file_name;
+ $diffinfo{'from_file'} = esc_html($file_name);
+ $diffinfo{'to_file'} = esc_html($file_name);
}
} else { # no filename given
$diffinfo{'status'} = '2';
@@ -3135,7 +3125,7 @@ sub git_blobdiff {
-type => 'text/plain',
-charset => 'utf-8',
-expires => $expires,
- -content_disposition => qq(inline; filename="${file_name}.patch"));
+ -content_disposition => qq(inline; filename=") . quotemeta($file_name) . qq(.patch"));
print "X-Git-Url: " . $cgi->self_url() . "\n\n";
@@ -3585,7 +3575,7 @@ XML
if (!($line =~ m/^:([0-7]{6}) ([0-7]{6}) ([0-9a-fA-F]{40}) ([0-9a-fA-F]{40}) (.)([0-9]{0,3})\t(.*)$/)) {
next;
}
- my $file = validate_input(unquote($7));
+ my $file = esc_html(unquote($7));
$file = decode("utf8", $file, Encode::FB_DEFAULT);
print "$file<br/>\n";
}
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] gitweb: Consolidate escaping/validation of query string
2006-09-23 22:18 [PATCH] gitweb: Consolidate escaping/validation of query string Petr Baudis
@ 2006-09-23 22:36 ` Jakub Narebski
2006-09-23 22:41 ` Jakub Narebski
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Jakub Narebski @ 2006-09-23 22:36 UTC (permalink / raw)
To: git
Petr Baudis wrote:
> Consider:
>
> http://repo.or.cz/?p=glibc-cvs.git;a=tree;h=2609cb0411389325f4ee2854cc7159756eb0671e;hb=2609cb0411389325f4ee2854cc7159756eb0671e
>
> (click on the funny =__ify file)
Aaargh? Why this name?
> We ought to handle anything in filenames and I actually see no reason why
> we don't, modulo very little missing escaping that this patch hopefully
> also fixes.
>
> I have also made esc_param() escape [?=&;]. Not escaping [&;] was downright
> buggy and [?=] just feels better escaped. ;-) YMMV.
First of all, before introduction href() subroutine we escaped (using
esc_param) the _whole_ generated query string. Now we need to escape
only value part (as we can assume that parameter names are correct;
they are taken from limited pre-defined set).
I'd rather have new esc_param() or esc_param_value() quote like escape
subroutine from CGI::Util, with the esception of _not_ escaping '/'
(it makes funny bookmark, and lot less readable query string), and rename
current esc_param() to esc_query_string() or esc_params().
Perhaps we should have also esc_arg() for things like title attribute
of <a> (link) element (or other element) and filename="..." part of
Content-disposition: HTTP header.
By the way, the validate_input() should be split into separate subroutines:
validate_ref() for validating hash, hash_base, hash_parent, hash_parent_base,
and validate_path() for validating project, file_name and file_parent
parameters.
We should _never_ use esc_html except during the output, or just before output.
It certainly shouldn't take place in parse_* subroutine (or in the fake parse
like in git_blobdiff)!
P.S. gitweb was tested I think with filenames wirh spaces. Perhaps we should
add some other test file to gitweb/test, with '=', '@', '$', '?', '*', ' ', '\n'
etc.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] gitweb: Consolidate escaping/validation of query string
2006-09-23 22:36 ` Jakub Narebski
@ 2006-09-23 22:41 ` Jakub Narebski
2006-09-24 11:36 ` Petr Baudis
2006-09-24 11:39 ` Petr Baudis
2 siblings, 0 replies; 7+ messages in thread
From: Jakub Narebski @ 2006-09-23 22:41 UTC (permalink / raw)
To: git
Jakub Narebski wrote:
> I'd rather have new esc_param() or esc_param_value() quote like escape
> subroutine from CGI::Util, with the esception of _not_ escaping '/'
> (it makes funny bookmark, and lot less readable query string), and rename
> current esc_param() to esc_query_string() or esc_params().
Or just esc_url()
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] gitweb: Consolidate escaping/validation of query string
2006-09-23 22:36 ` Jakub Narebski
2006-09-23 22:41 ` Jakub Narebski
@ 2006-09-24 11:36 ` Petr Baudis
2006-09-24 12:21 ` Jakub Narebski
2006-09-24 11:39 ` Petr Baudis
2 siblings, 1 reply; 7+ messages in thread
From: Petr Baudis @ 2006-09-24 11:36 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Dear diary, on Sun, Sep 24, 2006 at 12:36:13AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Petr Baudis wrote:
> > (click on the funny =__ify file)
>
> Aaargh? Why this name?
You tell me... ;-)
> > I have also made esc_param() escape [?=&;]. Not escaping [&;] was downright
> > buggy and [?=] just feels better escaped. ;-) YMMV.
..snip..
> I'd rather have new esc_param() or esc_param_value() quote like escape
> subroutine from CGI::Util, with the esception of _not_ escaping '/'
> (it makes funny bookmark, and lot less readable query string), and rename
> current esc_param() to esc_query_string() or esc_params().
Huh, well, what's the point with the rename and why not keep it as it is
with just removing the four characters above? Escaped stuff looks ugly
in a URL. ;-)
BTW, looking at CGI::Util innards, what sick mind serves CGIs from an
EBCDIC machine?
> Perhaps we should have also esc_arg() for things like title attribute
> of <a> (link) element (or other element)
Yes. I wanted to implement your few months old wish to have full string
of abbreviated column contents in title attributes but delayed it for
now because we have no such function yet.
> and filename="..." part of Content-disposition: HTTP header.
This is not HTML-ish so you need quotemeta() here, using entities makes
no sense in this case.
> By the way, the validate_input() should be split into separate subroutines:
> validate_ref() for validating hash, hash_base, hash_parent, hash_parent_base,
> and validate_path() for validating project,
Yes, that would be nice.
> file_name and file_parent parameters.
What's the point in validating those?
> We should _never_ use esc_html except during the output, or just before output.
> It certainly shouldn't take place in parse_* subroutine (or in the fake parse
> like in git_blobdiff)!
Yes, I agree. Will send a fixed patch.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] gitweb: Consolidate escaping/validation of query string
2006-09-23 22:36 ` Jakub Narebski
2006-09-23 22:41 ` Jakub Narebski
2006-09-24 11:36 ` Petr Baudis
@ 2006-09-24 11:39 ` Petr Baudis
2006-09-24 12:31 ` Jakub Narebski
2 siblings, 1 reply; 7+ messages in thread
From: Petr Baudis @ 2006-09-24 11:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Consider:
http://repo.or.cz/?p=glibc-cvs.git;a=tree;h=2609cb0411389325f4ee2854cc7159756eb0671e;hb=2609cb0411389325f4ee2854cc7159756eb0671e
(click on the funny =__ify file)
We ought to handle anything in filenames and I actually see no reason why
we don't, modulo very little missing escaping that this patch hopefully
also fixes.
I have also made esc_param() escape [?=&;]. Not escaping [&;] was downright
buggy and [?=] just feels better escaped. ;-) YMMV.
Signed-off-by: Petr Baudis <pasky@suse.cz>
---
gitweb/gitweb.perl | 28 +++++++++-------------------
1 files changed, 9 insertions(+), 19 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 2c6b197..f3c5bd8 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -212,19 +212,9 @@ if (defined $project) {
}
}
+# We have to handle those containing any characters:
our $file_name = $cgi->param('f');
-if (defined $file_name) {
- if (!validate_input($file_name)) {
- die_error(undef, "Invalid file parameter");
- }
-}
-
our $file_parent = $cgi->param('fp');
-if (defined $file_parent) {
- if (!validate_input($file_parent)) {
- die_error(undef, "Invalid file parent parameter");
- }
-}
our $hash = $cgi->param('h');
if (defined $hash) {
@@ -305,7 +295,7 @@ sub evaluate_path_info {
$action ||= "blob_plain";
}
$hash_base ||= validate_input($refname);
- $file_name ||= validate_input($pathname);
+ $file_name ||= $pathname;
} elsif (defined $refname) {
# we got "project.git/branch"
$action ||= "shortlog";
@@ -416,7 +406,7 @@ # quote unsafe chars, but keep the slash
# correct, but quoted slashes look too horrible in bookmarks
sub esc_param {
my $str = shift;
- $str =~ s/([^A-Za-z0-9\-_.~();\/;?:@&=])/sprintf("%%%02X", ord($1))/eg;
+ $str =~ s/([^A-Za-z0-9\-_.~()\/:@])/sprintf("%%%02X", ord($1))/eg;
$str =~ s/\+/%2B/g;
$str =~ s/ /\+/g;
return $str;
@@ -1289,7 +1279,7 @@ sub git_header_html {
if (defined $action) {
$title .= "/$action";
if (defined $file_name) {
- $title .= " - $file_name";
+ $title .= " - " . esc_html($file_name);
if ($action eq "tree" && $file_name !~ m|/$|) {
$title .= "/";
}
@@ -2439,7 +2429,7 @@ sub git_blame2 {
if ($ftype !~ "blob") {
die_error("400 Bad Request", "Object is not a blob");
}
- open ($fd, "-|", git_cmd(), "blame", '-l', $file_name, $hash_base)
+ open ($fd, "-|", git_cmd(), "blame", '-l', '--', $file_name, $hash_base)
or die_error(undef, "Open git-blame failed");
git_header_html();
my $formats_nav =
@@ -3135,7 +3125,7 @@ sub git_blobdiff {
-type => 'text/plain',
-charset => 'utf-8',
-expires => $expires,
- -content_disposition => qq(inline; filename="${file_name}.patch"));
+ -content_disposition => qq(inline; filename=") . quotemeta($file_name) . qq(.patch"));
print "X-Git-Url: " . $cgi->self_url() . "\n\n";
@@ -3155,8 +3145,8 @@ sub git_blobdiff {
} else {
while (my $line = <$fd>) {
- $line =~ s!a/($hash|$hash_parent)!a/$diffinfo{'from_file'}!g;
- $line =~ s!b/($hash|$hash_parent)!b/$diffinfo{'to_file'}!g;
+ $line =~ s!a/($hash|$hash_parent)!'a/'.esc_html($diffinfo{'from_file'})!eg;
+ $line =~ s!b/($hash|$hash_parent)!'b/'.esc_html($diffinfo{'to_file'})!eg;
print $line;
@@ -3585,7 +3575,7 @@ XML
if (!($line =~ m/^:([0-7]{6}) ([0-7]{6}) ([0-9a-fA-F]{40}) ([0-9a-fA-F]{40}) (.)([0-9]{0,3})\t(.*)$/)) {
next;
}
- my $file = validate_input(unquote($7));
+ my $file = esc_html(unquote($7));
$file = decode("utf8", $file, Encode::FB_DEFAULT);
print "$file<br/>\n";
}
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] gitweb: Consolidate escaping/validation of query string
2006-09-24 11:36 ` Petr Baudis
@ 2006-09-24 12:21 ` Jakub Narebski
0 siblings, 0 replies; 7+ messages in thread
From: Jakub Narebski @ 2006-09-24 12:21 UTC (permalink / raw)
To: Petr Baudis, git
Petr "Pasky" Baudis wrote:
>>> I have also made esc_param() escape [?=&;]. Not escaping [&;] was downright
>>> buggy and [?=] just feels better escaped. ;-) YMMV.
> ..snip..
>> I'd rather have new esc_param() or esc_param_value() quote like escape
>> subroutine from CGI::Util, with the esception of _not_ escaping '/'
>> (it makes funny bookmark, and lot less readable query string), and rename
>> current esc_param() to esc_query_string() or esc_params().
>
> Huh, well, what's the point with the rename and why not keep it as it is
> with just removing the four characters above? Escaped stuff looks ugly
> in a URL. ;-)
There are few places where we escape whole URL (so I'd prefer esc_url() for
current implementation): esc_param($home_link) and soon esc_param($githelp_url)
(and _not_ esc_html($githelp_url)). And those URLs can contain query strings,
so we cannot escape '?', ';' and '&', '=' there.
Before introduction of href() subroutine we escaped using esc_param the whole
query string, hence esc_param did not escaped [?=&;].
--
Jakub Narebski
Poland
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] gitweb: Consolidate escaping/validation of query string
2006-09-24 11:39 ` Petr Baudis
@ 2006-09-24 12:31 ` Jakub Narebski
0 siblings, 0 replies; 7+ messages in thread
From: Jakub Narebski @ 2006-09-24 12:31 UTC (permalink / raw)
To: git
Petr Baudis wrote:
> Consider:
>
> http://repo.or.cz/?p=glibc-cvs.git;a=tree;h=2609cb0411389325f4ee2854cc7159756eb0671e;hb=2609cb0411389325f4ee2854cc7159756eb0671e
>
> (click on the funny =__ify file)
>
> We ought to handle anything in filenames and I actually see no reason why
> we don't, modulo very little missing escaping that this patch hopefully
> also fixes.
>
> I have also made esc_param() escape [?=&;]. Not escaping [&;] was downright
> buggy and [?=] just feels better escaped. ;-) YMMV.
>
> Signed-off-by: Petr Baudis <pasky@suse.cz>
This patch contains a few unrelated changes:
* changing semantics of esc_param subroutine
* adding some esc_html where it could be needed
* removing $file_name and $file_parent validation
About change to esc_param: we need current version of esc_param, perhaps
to be named esc_url to be able to say esc_url($home_link) and soon
esc_url($githelp_url).
About adding esc_html where it could be needed: good change, see
my comments below.
About removing $file_name and $file_parent validation: those parameters
have exactly the same textual restrictions as $project parameter - they
are pathnames.
> @@ -2439,7 +2429,7 @@ sub git_blame2 {
> if ($ftype !~ "blob") {
> die_error("400 Bad Request", "Object is not a blob");
> }
> - open ($fd, "-|", git_cmd(), "blame", '-l', $file_name, $hash_base)
> + open ($fd, "-|", git_cmd(), "blame", '-l', '--', $file_name, $hash_base)
> or die_error(undef, "Open git-blame failed");
> git_header_html();
> my $formats_nav =
Slightly unrelated change. Shouldn't it be
open $fd, "-|", git_cmd(), "blame", '-l', $hash_base, "--", $file_name
by the way?
> @@ -3135,7 +3125,7 @@ sub git_blobdiff {
> -type => 'text/plain',
> -charset => 'utf-8',
> -expires => $expires,
> - -content_disposition => qq(inline; filename="${file_name}.patch"));
> + -content_disposition => qq(inline; filename=") . quotemeta($file_name) . qq(.patch"));
>
> print "X-Git-Url: " . $cgi->self_url() . "\n\n";
>
I'd check other places where we output Content-Disposition: header.
At least one place needs similar quotemeta somewhere.
> @@ -3585,7 +3575,7 @@ XML
> if (!($line =~ m/^:([0-7]{6}) ([0-7]{6}) ([0-9a-fA-F]{40}) ([0-9a-fA-F]{40}) (.)([0-9]{0,3})\t(.*)$/)) {
> next;
> }
> - my $file = validate_input(unquote($7));
> + my $file = esc_html(unquote($7));
> $file = decode("utf8", $file, Encode::FB_DEFAULT);
> print "$file<br/>\n";
> }
I'd say perhaps
my $file = unquote($7);
$file = decode("utf8", $file, Encode::FB_DEFAULT);
print esc_html($file) . "<br/>\n";
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2006-09-24 12:31 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-09-23 22:18 [PATCH] gitweb: Consolidate escaping/validation of query string Petr Baudis
2006-09-23 22:36 ` Jakub Narebski
2006-09-23 22:41 ` Jakub Narebski
2006-09-24 11:36 ` Petr Baudis
2006-09-24 12:21 ` Jakub Narebski
2006-09-24 11:39 ` Petr Baudis
2006-09-24 12:31 ` Jakub Narebski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).