* gitweb: using quotemeta
@ 2006-09-28 21:15 Luben Tuikov
2006-09-28 23:18 ` Junio C Hamano
0 siblings, 1 reply; 16+ messages in thread
From: Luben Tuikov @ 2006-09-28 21:15 UTC (permalink / raw)
To: git
Commit ab41dfbfd4f3f9fedac71550027e9813b11abe3d introduces
the use of quotemeta to quote the $filename of the snapshot.
The commit message explains:
Just in case filename contains end of line character.
But quotemeta quotes any character not matching /[A-Za-z_0-9]/.
Which means that we get strings like this:
linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
Is this the desired behavior? FWIW, the backslash character
is not part of the name, but it ended up there when the snapshot was written
to the filesystem.
Thanks,
Luben
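For illustration only (an editorial sketch, not part of the original message): Perl's built-in quotemeta backslash-escapes every ASCII character other than letters, digits and underscore, which would explain why the '-' and '.' in the snapshot name pick up backslashes. The variable names are made up for the example.

```perl
use strict;
use warnings;

# Snapshot name taken from the message above.
my $filename = 'linux-2.6.git-5c2d97cb31fb77981797fec46230ca005b865799.tar.gz';

# quotemeta is intended for building regular expressions: it escapes every
# ASCII character that does not match [A-Za-z_0-9].
my $quoted = quotemeta($filename);

print "$quoted\n";
```

The printed string contains a backslash in front of each '-' and '.', matching the name reported above.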
* Re: gitweb: using quotemeta
2006-09-28 21:15 gitweb: using quotemeta Luben Tuikov
@ 2006-09-28 23:18 ` Junio C Hamano
2006-09-28 23:27 ` Luben Tuikov
0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2006-09-28 23:18 UTC (permalink / raw)
To: ltuikov; +Cc: git
Luben Tuikov <ltuikov@yahoo.com> writes:
> Commit ab41dfbfd4f3f9fedac71550027e9813b11abe3d introduces
> the use of quotemeta to quote the $filename of the snapshot.
> The commit message explains:
>
> Just in case filename contains end of line character.
>
> But quotemeta quotes any character not matching /[A-Za-z_0-9]/.
> Which means that we get strings like this:
>
> linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
>
> Is this the desired behavior? FWIW, the backslash character
> is not part of the name, but it ended up there when the snapshot was written
> to the filesystem.
Ouch, that was sloppy planning and coding, and sloppier
reviewing. Sorry.
What is the right quoting there? Just quoting double-quotes?
* Re: gitweb: using quotemeta
2006-09-28 23:18 ` Junio C Hamano
@ 2006-09-28 23:27 ` Luben Tuikov
2006-10-02 0:28 ` Jakub Narebski
0 siblings, 1 reply; 16+ messages in thread
From: Luben Tuikov @ 2006-09-28 23:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
--- Junio C Hamano <junkio@cox.net> wrote:
>
> Ouch, that was sloppy planning and coding, and sloppier
> reviewing. Sorry.
>
> What is the right quoting there? Just quoting double-quotes?
I'm not sure. What undesired character could we have in $filename
of a snapshot? The commit ab41dfbfd4f message gives this
justification: "Just in case filename contains end of line character."
It looks like $filename is constructed by well defined strings:
basename($project), $hash and $suffix all of which should be ok.
I'd say we don't need quotemeta for $filename of snapshot.
Luben
* Re: gitweb: using quotemeta
2006-09-28 23:27 ` Luben Tuikov
@ 2006-10-02 0:28 ` Jakub Narebski
2006-10-02 20:12 ` Luben Tuikov
0 siblings, 1 reply; 16+ messages in thread
From: Jakub Narebski @ 2006-10-02 0:28 UTC (permalink / raw)
To: git
Luben Tuikov wrote:
> --- Junio C Hamano <junkio@cox.net> wrote:
>>
>> Ouch, that was sloppy planning and coding, and sloppier
>> reviewing. Sorry.
>>
>> What is the right quoting there? Just quoting double-quotes?
>
> I'm not sure. What undesired character could we have in $filename
> of a snapshot? The commit ab41dfbfd4f message gives this
> justification: "Just in case filename contains end of line character."
>
> It looks like $filename is constructed by well defined strings:
> basename($project), $hash and $suffix all of which should be ok.
>
> I'd say we don't need quotemeta for $filename of snapshot.
But we do need quoting for blob_plain and perhaps blobdiff_plain
views, although not quotemeta, but perhaps the reverse of unescape,
i.e. quote '"', EOLN (end of line) and perhaps also TAB.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: gitweb: using quotemeta
2006-10-02 0:28 ` Jakub Narebski
@ 2006-10-02 20:12 ` Luben Tuikov
2006-10-02 20:50 ` Jakub Narebski
2006-10-06 13:50 ` Petr Baudis
0 siblings, 2 replies; 16+ messages in thread
From: Luben Tuikov @ 2006-10-02 20:12 UTC (permalink / raw)
To: Jakub Narebski, git
--- Jakub Narebski <jnareb@gmail.com> wrote:
> Luben Tuikov wrote:
>
> > --- Junio C Hamano <junkio@cox.net> wrote:
> >>
> >> Ouch, that was sloppy planning and coding, and sloppier
> >> reviewing. Sorry.
> >>
> >> What is the right quoting there? Just quoting double-quotes?
> >
> > I'm not sure. What undesired character could we have in $filename
> > of a snapshot? The commit ab41dfbfd4f message gives this
> > justification: "Just in case filename contains end of line character."
> >
> > It looks like $filename is constructed by well defined strings:
> > basename($project), $hash and $suffix all of which should be ok.
> >
> > I'd say we don't need quotemeta for $filename of snapshot.
>
> But we do need quoting for blob_plain and perhaps blobdiff_plain
> views, although not quotemeta, but perhaps the reverse of unescape,
> i.e. quote '"', EOLN (end of line) and perhaps also TAB.
Escaping for the purposes of HTML _view_ and URL generation is ok,
but it is not ok when _saving_ the file with a file name.
A file name is just a string of chars, and I want to _save_ the file
name as its name is. No changes or interpretations please. I don't
care what the string is, what chars it is made of, etc.
Please don't interpret file names and their characters when the files
are _saved_ by the user's browser.
The file name in my filesystem should be the exact same file name
as it appears on any other filesystem hosting the same git repo.
I don't want this translation:
Server FS: linux-2.6.git-5c2d97cb31fb77981797fec46230ca005b865799.tar.gz
Quotemeta: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
User FS: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
When you committed ab41dfbfd4f3f9fedac71550027e9813b11abe3d, it extended
quotemeta to where it shouldn't have been applied.
Luben
P.S. When replying please don't redact the CC field.
* Re: gitweb: using quotemeta
2006-10-02 20:12 ` Luben Tuikov
@ 2006-10-02 20:50 ` Jakub Narebski
2006-10-03 6:30 ` Junio C Hamano
2006-10-06 13:50 ` Petr Baudis
1 sibling, 1 reply; 16+ messages in thread
From: Jakub Narebski @ 2006-10-02 20:50 UTC (permalink / raw)
To: Luben Tuikov, Junio Hamano; +Cc: git
Luben Tuikov wrote:
> --- Jakub Narebski <jnareb@gmail.com> wrote:
>> Luben Tuikov wrote:
>>
>>> --- Junio C Hamano <junkio@cox.net> wrote:
>>>>
>>>> Ouch, that was sloppy planning and coding, and sloppier
>>>> reviewing. Sorry.
>>>>
>>>> What is the right quoting there? Just quoting double-quotes?
>>>
>>> I'm not sure. What undesired character could we have in $filename
>>> of a snapshot? The commit ab41dfbfd4f message gives this
>>> justification: "Just in case filename contains end of line character."
>>>
>>> It looks like $filename is constructed by well defined strings:
>>> basename($project), $hash and $suffix all of which should be ok.
>>>
>>> I'd say we don't need quotemeta for $filename of snapshot.
>>
>> But we do need quoting for blob_plain and perhaps blobdiff_plain
>> views, although not quotemeta, but perhaps the reverse of unescape,
>> i.e. quote '"', EOLN (end of line) and perhaps also TAB.
>
> Escaping for the purposes of HTML _view_ and URL generation is ok,
> but it is not ok when _saving_ the file with a file name.
>
> A file name is just a string of chars, and I want to _save_ the file
> name as its name is. No changes or interpretations please. I don't
> care what the string is, what chars it is made of, etc.
We need to _escape_ HTML output (using esc_html subroutine) because
HTML treats some characters specially, and we need to escape them
to turn off this interpretation. Examples include SPC (for example, a
file name which has two consecutive spaces) and ampersand '&', which
might be treated as the start of an entity reference.
Sometimes we want to quote filename for view, for example to fit it
in one line, and to distinguish between tab and spaces.
But you forget that in HTTP headers, to be more exact in
Content-Disposition: inline; filename="<filename>"
header, the quote '"' and end-of-line '\n' characters in <filename>
are treated specially. So you need to quote somehow at least those
two characters.
<checks the RFC>
RFC 2616 "Hypertext Transfer Protocol -- HTTP/1.1":
RFC 1806 [35], from which the often implemented Content-Disposition
(see section 19.5.1) header in HTTP is derived, has a number of very
serious security considerations. Content-Disposition is not part of
the HTTP standard, but since it is widely implemented, we are
documenting its use and risks for implementors. See RFC 2183 [49]
(which updates RFC 1806) for details.
RFC 2183 "Communicating Presentation Information in Internet Messages:
The Content-Disposition Header Field":
2.3 The Filename Parameter
The sender may want to suggest a filename to be used if the entity is
detached and stored in a separate file. If the receiving MUA writes
the entity to a file, the suggested filename should be used as a
basis for the actual filename, where possible.
It is important that the receiving MUA not blindly use the suggested
filename. The suggested filename SHOULD be checked (and possibly
changed) to see that it conforms to local filesystem conventions,
does not overwrite an existing file, and does not present a security
problem (see Security Considerations below).
The receiving MUA SHOULD NOT respect any directory path information
that may seem to be present in the filename parameter. The filename
should be treated as a terminal component only. Portable
specification of directory paths might possibly be done in the future
via a separate Content-Disposition parameter, but no provision is
made for it in this draft.
Current [RFC 2045] grammar restricts parameter values (and hence
Content-Disposition filenames) to US-ASCII. We recognize the great
desirability of allowing arbitrary character sets in filenames, but
it is beyond the scope of this document to define the necessary
mechanisms. We expect that the basic [RFC 1521] `value'
specification will someday be amended to allow use of non-US-ASCII
characters, at which time the same mechanism should be used in the
Content-Disposition filename parameter.
> Please don't interpret file names and their characters when the files
> are _saved_ by the user's browser.
>
> The file name in my filesystem should be the exact same file name
> as it appears on any other filesystem hosting the same git repo.
As you can see from the above RFC, there is no standard way to pass a
suggested filename that is exactly the same as in the git repository if
it contains non-US-ASCII characters. At least not for the _HTTP_
Content-Disposition: header; for email we can use quoted-printable
from RFC 2047 to encode non-US-ASCII characters (including " and EOLN)
with a good probability of this working correctly.
> I don't want this translation:
> Server FS: linux-2.6.git-5c2d97cb31fb77981797fec46230ca005b865799.tar.gz
> Quotemeta: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
> User FS: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
>
> When you committed ab41dfbfd4f3f9fedac71550027e9813b11abe3d, it extended
> quotemeta to where it shouldn't have been applied.
I only followed (I admit blindly, without checking the details, and
I agree wrongly) the commit a2f3db2f5de2a3667b0e038aa65e3e097e642e7d
"gitweb: Consolidate escaping/validation of query string"
by Petr "Pasky" Baudis, which among other things included the following patch
(a part of the commit that is not documented well enough in the commit message):
@@ -3126,7 +3116,7 @@ sub git_blobdiff {
-type => 'text/plain',
-charset => 'utf-8',
-expires => $expires,
- -content_disposition => qq(inline; filename="${file_name}.patch"));
+ -content_disposition => qq(inline; filename=") . quotemeta($file_name) . qq(.patch"));
print "X-Git-Url: " . $cgi->self_url() . "\n\n";
And I agree that your
"gitweb: Don't use quotemeta on internally generated strings"
is (partially) correct.
> Luben
> P.S. When replying please don't redact the CC field.
When I reply by email, I have no problem (I have to add git
mailing list if I forget to use reply to all). When I reply
via GMane NNTP interface, KNode by default replies only to
newsgroup (coupled with git mailing list).
--
Jakub Narebski
Poland
* Re: gitweb: using quotemeta
2006-10-02 20:50 ` Jakub Narebski
@ 2006-10-03 6:30 ` Junio C Hamano
2006-10-06 12:38 ` Jakub Narebski
0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2006-10-03 6:30 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> writes:
> But you forget that in HTTP headers, to be more exact in
> Content-Disposition: inline; filename="<filename>"
> header, the quote '"' and end-of-line '\n' characters in <filename>
> are treated specially. So you need to quote somehow at least those
> two characters.
True, but untrue. This is just a suggestion so we do not _have_
to quote. We only need to avoid spitting out dq and lf
literally. We could even just do something like the attached if
we wanted to:
s/[^ -~]+/?/g ;# replace each sequence of bytes outside
# ' ' to '~' range to a '?'
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 44991b1..e7202ee 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2651,7 +2651,7 @@ sub git_blob_plain {
# save as filename, even when no $file_name is given
my $save_as = "$hash";
if (defined $file_name) {
- $save_as = $file_name;
+ ($save_as = $file_name) =~ s/[^ -~]+/?/g;
} elsif ($type =~ m/^text\//) {
$save_as .= '.txt';
}
@@ -2830,6 +2830,7 @@ sub git_snapshot {
}
my $filename = basename($project) . "-$hash.tar.$suffix";
+ $filename =~ s/[^ -~]+/?/g;
print $cgi->header(
-type => 'application/x-tar',
@@ -3139,6 +3140,7 @@ sub git_blobdiff {
}
} elsif ($format eq 'plain') {
+ $file_name =~ s/[^ -~]+/?/g;
print $cgi->header(
-type => 'text/plain',
-charset => 'utf-8',
@@ -3241,6 +3243,7 @@ sub git_commitdiff {
my $refs = git_get_references("tags");
my $tagname = git_get_rev_name_tags($hash);
my $filename = basename($project) . "-$hash.patch";
+ $filename =~ s/[^ -~]+/?/g;
print $cgi->header(
-type => 'text/plain',
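As an editorial aside, the substitution in the patch above can be tried in isolation. This is a minimal sketch; the helper name replace_unprintable is hypothetical and does not appear in gitweb. Note that the double quote (0x22) lies inside the ' ' to '~' range, so this particular pattern replaces a line feed but leaves '"' as it is.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the patch above.
sub replace_unprintable {
    my ($name) = @_;
    # Each run of bytes outside the printable ASCII range ' '..'~'
    # collapses into a single '?'.
    (my $safe = $name) =~ s/[^ -~]+/?/g;
    return $safe;
}

print replace_unprintable("two\nlines.txt"), "\n";  # the line feed is replaced
print replace_unprintable('say "hi".txt'), "\n";    # '"' is inside the range and is kept
```

If the double quote should be neutralised as well, it would have to be handled separately, since the character class does not cover it.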
* Re: gitweb: using quotemeta
2006-10-03 6:30 ` Junio C Hamano
@ 2006-10-06 12:38 ` Jakub Narebski
2006-10-07 5:08 ` Junio C Hamano
0 siblings, 1 reply; 16+ messages in thread
From: Jakub Narebski @ 2006-10-06 12:38 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
>> But you forget that in HTTP headers, to be more exact in
>> Content-Disposition: inline; filename="<filename>"
>> header, the quote '"' and end-of-line '\n' characters in <filename>
>> are treated specially. So you need to quote somehow at least those
>> two characters.
>
> True, but untrue. This is just a suggestion so we do not _have_
> to quote. We only need to avoid spitting out dq and lf
> literally. We could even just do something like the attached if
> we wanted to:
>
> s/[^ -~]+/?/g ;# replace each sequence of bytes outside
> # ' ' to '~' range to a '?'
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 44991b1..e7202ee 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -2651,7 +2651,7 @@ sub git_blob_plain {
> # save as filename, even when no $file_name is given
> my $save_as = "$hash";
> if (defined $file_name) {
> - $save_as = $file_name;
> + ($save_as = $file_name) =~ s/[^ -~]+/?/g;
> } elsif ($type =~ m/^text\//) {
> $save_as .= '.txt';
> }
I'd rather add (and use) a separate subroutine for quoting/escaping
values in HTTP headers, or to be more exact for the filename part
of the HTTP header "Content-Disposition:". This way, if we decide
not to replace all characters outside US-ASCII in the suggested filename
with '?', but only the doublequote '"' and linefeed '\n' characters,
or even implement RFC 2047 to do the encoding (of course if browsers
can read it), we could do this in one place.
How should such a subroutine be named? esc_http? esc_header or esc_hdr?
esc_http_header? Any other ideas?
--
Jakub Narebski
Poland
* Re: gitweb: using quotemeta
2006-10-02 20:12 ` Luben Tuikov
2006-10-02 20:50 ` Jakub Narebski
@ 2006-10-06 13:50 ` Petr Baudis
2006-10-06 18:21 ` Luben Tuikov
1 sibling, 1 reply; 16+ messages in thread
From: Petr Baudis @ 2006-10-06 13:50 UTC (permalink / raw)
To: Luben Tuikov; +Cc: Jakub Narebski, git
Dear diary, on Mon, Oct 02, 2006 at 10:12:56PM CEST, I got a letter
where Luben Tuikov <ltuikov@yahoo.com> said that...
> Escaping for the purposes of HTML _view_ and URL generation is ok,
> but it is not ok when _saving_ the file with a file name.
>
> A file name is just a string of chars, and I want to _save_ the file
> name as its name is. No changes or interpretations please. I don't
> care what the string is, what chars it is made of, etc.
>
> Please don't interpret file names and their characters when the files
> are _saved_ by the user's browser.
>
> The file name in my filesystem should be the exact same file name
> as it appears on any other filesystem hosting the same git repo.
>
> I don't want this translation:
> Server FS: linux-2.6.git-5c2d97cb31fb77981797fec46230ca005b865799.tar.gz
> Quotemeta: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
> User FS: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
Then the user agent is buggy - which browser exhibits this behaviour?
According to RFC2183, the filename is a value. According to RFC2045, a
value is either a token (uninteresting) or a quoted-string. According to
RFC822:
    quoted-string = <"> *(qtext/quoted-pair) <">  ; Regular qtext or
                                                  ;   quoted chars.
    qtext         = <any CHAR excepting <">,      ; => may be folded
                     "\" & CR, and including
                     linear-white-space>
    quoted-pair   = "\" CHAR                      ; may quote any char
So what we emit is completely correct.
Of course I have nothing against escaping just ", \ and CR. I don't mind
just substituting CR with some other string, but please just quote " and
\ correctly.
As of now, gitweb will not handle any filenames containing those three
characters properly.
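A minimal sketch of the narrower escaping suggested here, assuming the goal is the body of an RFC 822 quoted-string: backslash-quote '\' and '"', and substitute CR with another string. The name escape_for_quoted_string and the choice of the two characters '\r' as the replacement text are assumptions for illustration, not the code that went into gitweb.

```perl
use strict;
use warnings;

# Hypothetical helper; not gitweb's actual subroutine.
sub escape_for_quoted_string {
    my ($str) = @_;
    $str =~ s/\\/\\\\/g;   # backslash first, so later escapes are not doubled
    $str =~ s/"/\\"/g;     # double quote becomes a quoted-pair
    $str =~ s/\r/\\r/g;    # a bare CR is replaced by the two characters \ and r
    return $str;
}

my $name = 'a.b';
print 'filename="', escape_for_quoted_string($name), '"', "\n";
```

Unlike quotemeta, this leaves ordinary names such as a.b untouched, so the suggested filename reaches the browser without extra backslashes.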
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
* Re: gitweb: using quotemeta
2006-10-06 13:50 ` Petr Baudis
@ 2006-10-06 18:21 ` Luben Tuikov
2006-10-06 18:45 ` Petr Baudis
0 siblings, 1 reply; 16+ messages in thread
From: Luben Tuikov @ 2006-10-06 18:21 UTC (permalink / raw)
To: Petr Baudis; +Cc: Jakub Narebski, git
--- Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Mon, Oct 02, 2006 at 10:12:56PM CEST, I got a letter
> where Luben Tuikov <ltuikov@yahoo.com> said that...
> > Escaping for the purposes of HTML _view_ and URL generation is ok,
> > but it is not ok when _saving_ the file with a file name.
> >
> > A file name is just a string of chars, and I want to _save_ the file
> > name as its name is. No changes or interpretations please. I don't
> > care what the string is, what chars it is made of, etc.
> >
> > Please don't interpret file names and their characters when the files
> > are _saved_ by the user's browser.
> >
> > The file name in my filesystem should be the exact same file name
> > as it appears on any other filesystem hosting the same git repo.
> >
> > I don't want this translation:
> > Server FS: linux-2.6.git-5c2d97cb31fb77981797fec46230ca005b865799.tar.gz
> > Quotemeta: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
> > User FS: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
>
> Then the user agent is buggy - which browser exhibits this behaviour?
Latest greatest Firefox for Linux. And no, I don't think that
the browser is broken.
> According to RFC2183, the filename is a value. According to RFC2045, a
> value is either a token (uninteresting) or a quoted-string. According to
> RFC822:
>
> quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
> ; quoted chars.
>
> qtext = <any CHAR excepting <">, ; => may be folded
> "\" & CR, and including
> linear-white-space>
>
> quoted-pair = "\" CHAR ; may quote any char
>
> So what we emit is completely correct.
(Your quotations do not seem correct according to
ftp://ftp.rfc-editor.org/in-notes/rfc2045.txt !)
Petr, I agree with you that what we emit is "completely correct".
But it is _mangled_. I.e. why mangle the filename from "a.b" to
"a\.b" ? Indeed the latter _is_ qtext but it is not the original name
given to the file.
What actually happened is that _gitweb_ itself mangles the name.
> Of course I have nothing against escaping just ", \ and CR. I don't mind
> just substituting CR with some other string, but please just quote " and
> \ correctly.
Indeed, these three are the only chars NOT ALLOWED in qtext.
Will quote those. Thanks for pointing this out.
> As of now, gitweb will not handle any filenames containing those three
> characters properly.
Will fix.
Thanks,
Luben
>
> --
> Petr "Pasky" Baudis
> Stuff: http://pasky.or.cz/
> #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
> $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
> lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
>
* Re: gitweb: using quotemeta
@ 2006-10-06 18:24 Luben Tuikov
0 siblings, 0 replies; 16+ messages in thread
From: Luben Tuikov @ 2006-10-06 18:24 UTC (permalink / raw)
To: Petr Baudis; +Cc: Jakub Narebski, git
--- Luben Tuikov <ltuikov@yahoo.com> wrote:
> --- Petr Baudis <pasky@suse.cz> wrote:
> > Dear diary, on Mon, Oct 02, 2006 at 10:12:56PM CEST, I got a letter
> > where Luben Tuikov <ltuikov@yahoo.com> said that...
> > > Escaping for the purposes of HTML _view_ and URL generation is ok,
> > > but it is not ok when _saving_ the file with a file name.
> > >
> > > A file name is just a string of chars, and I want to _save_ the file
> > > name as its name is. No changes or interpretations please. I don't
> > > care what the string is, what chars it is made of, etc.
> > >
> > > Please don't interpret file names and their characters when the files
> > > are _saved_ by the user's browser.
> > >
> > > The file name in my filesystem should be the exact same file name
> > > as it appears on any other filesystem hosting the same git repo.
> > >
> > > I don't want this translation:
> > > Server FS: linux-2.6.git-5c2d97cb31fb77981797fec46230ca005b865799.tar.gz
> > > Quotemeta: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
> > > User FS: linux\-2\.6\.git\-5c2d97cb31fb77981797fec46230ca005b865799\.tar\.gz
> >
> > Then the user agent is buggy - which browser exhibits this behaviour?
>
> Latest greatest Firefox for Linux. And no, I don't think that
> the browser is broken.
>
> > According to RFC2183, the filename is a value. According to RFC2045, a
> > value is either a token (uninteresting) or a quoted-string. According to
> > RFC822:
> >
> > quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
> > ; quoted chars.
> >
> > qtext = <any CHAR excepting <">, ; => may be folded
> > "\" & CR, and including
> > linear-white-space>
> >
> > quoted-pair = "\" CHAR ; may quote any char
> >
> > So what we emit is completely correct.
>
> (Your quotations do not seem correct according to
> ftp://ftp.rfc-editor.org/in-notes/rfc2045.txt !)
>
> Petr, I agree with you that what we emit is "completely correct".
>
> But it is _mangled_. I.e. why mangle the filename from "a.b" to
> "a\.b" ? Indeed the latter _is_ qtext but it is not the original name
> given to the file.
Sorry, I meant to say that the latter doesn't appear to be qtext.
The bottom line is that quotemeta does not convert into qtext, and thus
should never have been used.
Will fix as per your suggestions.
Luben
>
> What actually happened is that _gitweb_ itself mangles the name.
>
> > Of course I have nothing against escaping just ", \ and CR. I don't mind
> > just substituting CR with some other string, but please just quote " and
> > \ correctly.
>
> Indeed, these three are the only chars NOT ALLOWED in qtext.
> Will quote those. Thanks for pointing this out.
>
> > As of now, gitweb will not handle any filenames containing those three
> > characters properly.
>
> Will fix.
>
> Thanks,
> Luben
>
>
>
> >
> > --
> > Petr "Pasky" Baudis
> > Stuff: http://pasky.or.cz/
> > #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
> > $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
> > lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
> >
>
>
* Re: gitweb: using quotemeta
2006-10-06 18:21 ` Luben Tuikov
@ 2006-10-06 18:45 ` Petr Baudis
0 siblings, 0 replies; 16+ messages in thread
From: Petr Baudis @ 2006-10-06 18:45 UTC (permalink / raw)
To: Luben Tuikov; +Cc: Jakub Narebski, git
Dear diary, on Fri, Oct 06, 2006 at 08:21:05PM CEST, I got a letter
where Luben Tuikov <ltuikov@yahoo.com> said that...
> > According to RFC2183, the filename is a value. According to RFC2045, a
> > value is either a token (uninteresting) or a quoted-string. According to
> > RFC822:
> >
> > quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
> > ; quoted chars.
> >
> > qtext = <any CHAR excepting <">, ; => may be folded
> > "\" & CR, and including
> > linear-white-space>
> >
> > quoted-pair = "\" CHAR ; may quote any char
> >
> > So what we emit is completely correct.
>
> (Your quotations do not seem correct according to
> ftp://ftp.rfc-editor.org/in-notes/rfc2045.txt !)
Wow, you caused my GNOME at work to do something totally horrible after
me clicking on that link... ;-)
I'm not sure how RFC2045 is relevant - this is from RFC822; RFC2045 does
not define those non-terminals.
> Petr, I agree with you that what we emit is "completely correct".
>
> But it is _mangled_. I.e. why mangle the filename from "a.b" to
> "a\.b" ? Indeed the latter _is_ qtext but it is not the original name
> given to the file.
..snip..
> Sorry, I meant to say that the latter doesn't appear to be qtext.
>
> The bottom line is that quotemeta does not convert into qtext, and thus
> should never have been used.
It's a moot point now, but I don't see that - inside qtext, any
character can be quoted, so what we do is technically ok.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
* Re: gitweb: using quotemeta
2006-10-06 12:38 ` Jakub Narebski
@ 2006-10-07 5:08 ` Junio C Hamano
2006-10-07 9:23 ` Jakub Narebski
0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2006-10-07 5:08 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> writes:
> I'd rather add (and use) separate subroutine for quoting/escaping
> values in HTTP headers, or to be more exact for the filename part
> of HTTP header "Content-Disposition:". This way if we decide to
> not replace all characters outside US-ASCII in suggested filename
> to save with '?', but only doublequote '"' and linefeed '\n' characters,
> or even implement RFC 2047 to do the encoding (of course if browsers
> can read it), we could do this in one place.
Sounds sane. quote_filename?
* Re: gitweb: using quotemeta
2006-10-07 5:08 ` Junio C Hamano
@ 2006-10-07 9:23 ` Jakub Narebski
2006-10-07 17:41 ` Luben Tuikov
0 siblings, 1 reply; 16+ messages in thread
From: Jakub Narebski @ 2006-10-07 9:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
>> I'd rather add (and use) separate subroutine for quoting/escaping
>> values in HTTP headers, or to be more exact for the filename part
>> of HTTP header "Content-Disposition:". This way if we decide to
>> not replace all characters outside US-ASCII in suggested filename
>> to save with '?', but only doublequote '"' and linefeed '\n' characters,
>> or even implement RFC 2047 to do the encoding (of course if browsers
>> can read it), we could do this in one place.
>
> Sounds sane. quote_filename?
Luben Tuikov used to_qtext in
"[PATCH] gitweb: Convert Content-Disposition filenames into qtext"
Msg-ID: <20061006191801.68649.qmail@web31815.mail.mud.yahoo.com>
http://permalink.gmane.org/gmane.comp.version-control.git/28437
--
Jakub Narebski
Poland
* Re: gitweb: using quotemeta
2006-10-07 9:23 ` Jakub Narebski
@ 2006-10-07 17:41 ` Luben Tuikov
2006-10-11 9:48 ` Jakub Narebski
0 siblings, 1 reply; 16+ messages in thread
From: Luben Tuikov @ 2006-10-07 17:41 UTC (permalink / raw)
To: Jakub Narebski, Junio C Hamano; +Cc: git
--- Jakub Narebski <jnareb@gmail.com> wrote:
> Junio C Hamano wrote:
> > Jakub Narebski <jnareb@gmail.com> writes:
> >
> >> I'd rather add (and use) separate subroutine for quoting/escaping
> >> values in HTTP headers, or to be more exact for the filename part
> >> of HTTP header "Content-Disposition:". This way if we decide to
> >> not replace all characters outside US-ASCII in suggested filename
> >> to save with '?', but only doublequote '"' and linefeed '\n' characters,
> >> or even implement RFC 2047 to do the encoding (of course if browsers
> >> can read it), we could do this in one place.
> >
> > Sounds sane. quote_filename?
>
> Luben Tuikov used to_qtext in
> "[PATCH] gitweb: Convert Content-Disposition filenames into qtext"
> Msg-ID: <20061006191801.68649.qmail@web31815.mail.mud.yahoo.com>
> http://permalink.gmane.org/gmane.comp.version-control.git/28437
I think that people familiar with the RFC will be able to quickly
recognize what this function does, after seeing "qtext" in the
name of the function. After all, not only filenames can be qtext.
Luben
* Re: gitweb: using quotemeta
2006-10-07 17:41 ` Luben Tuikov
@ 2006-10-11 9:48 ` Jakub Narebski
0 siblings, 0 replies; 16+ messages in thread
From: Jakub Narebski @ 2006-10-11 9:48 UTC (permalink / raw)
To: Luben Tuikov; +Cc: Junio C Hamano, git
Luben Tuikov wrote:
> > Luben Tuikov used to_qtext in
> > "[PATCH] gitweb: Convert Content-Disposition filenames into qtext"
> > Msg-ID: <20061006191801.68649.qmail@web31815.mail.mud.yahoo.com>
> > http://permalink.gmane.org/gmane.comp.version-control.git/28437
>
> I think that people familiar with the RFC will be able to quickly
> recognize what this function does, after seeing "qtext" in the
> name of the function. After all, not only filenames can be qtext.
It wasn't meant to criticize. Just pointing out. It is a nice
naming scheme (to_qtext, to_utf8) in addition to the esc_* naming scheme.
I had no good idea for an esc_* name for the to_qtext subroutine...
so to_qtext is better.
--
Jakub Narebski
Poland