* [PATCH] gitweb: use correct mime type even if filename has multiple dots.
@ 2006-09-16 21:09 Martin Waitz
2006-09-16 21:44 ` Jakub Narebski
0 siblings, 1 reply; 10+ messages in thread
From: Martin Waitz @ 2006-09-16 21:09 UTC (permalink / raw)
To: git
Match the last part of the filename against the extension from the
mime database instead of insisting that it starts at the first dot.
Signed-off-by: Martin Waitz <tali@admingilde.org>
---
gitweb/gitweb.perl | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index ee561c6..7501251 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1106,7 +1106,6 @@ sub mimetype_guess_file {
 	my $mimemap = shift;
 	-r $mimemap or return undef;

-	my %mimemap;
 	open(MIME, $mimemap) or return undef;
 	while (<MIME>) {
 		next if m/^#/; # skip comments
@@ -1114,14 +1113,16 @@ sub mimetype_guess_file {
 		if (defined $exts) {
 			my @exts = split(/\s+/, $exts);
 			foreach my $ext (@exts) {
-				$mimemap{$ext} = $mime;
+				if ($filename =~ /\.$ext$/) {
+					close(MIME);
+					return $mime;
+				}
 			}
 		}
 	}
 	close(MIME);

-	$filename =~ /\.(.*?)$/;
-	return $mimemap{$1};
+	return undef;
 }

 sub mimetype_guess {
--
1.4.2.gb8b6b
--
Martin Waitz
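
One detail of the matching loop in the patch above is that $ext is interpolated into the pattern as-is, so a dot inside an extension such as pcf.Z acts as a regex wildcard rather than a literal dot. A minimal sketch of the difference, assuming two hypothetical helpers (matches_raw and matches_quoted are illustration only, not gitweb functions) and a made-up file name:

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; neither is part of gitweb.

# Interpolates the extension unescaped, as the patch above does.
sub matches_raw {
    my ($filename, $ext) = @_;
    return $filename =~ /\.$ext$/ ? 1 : 0;
}

# Escapes regex metacharacters in the extension with \Q...\E first.
sub matches_quoted {
    my ($filename, $ext) = @_;
    return $filename =~ /\.\Q$ext\E$/ ? 1 : 0;
}

# "font.pcfxZ" is an invented name that does not end in ".pcf.Z".
for my $name ('font.pcf.Z', 'font.pcfxZ') {
    printf "%-12s raw=%d quoted=%d\n",
        $name, matches_raw($name, 'pcf.Z'), matches_quoted($name, 'pcf.Z');
}
```

Quoting the extension keeps the comparison a plain suffix test, whatever characters the mime database happens to contain.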
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-16 21:09 [PATCH] gitweb: use correct mime type even if filename has multiple dots Martin Waitz
@ 2006-09-16 21:44 ` Jakub Narebski
2006-09-17 7:51 ` Martin Waitz
0 siblings, 1 reply; 10+ messages in thread
From: Jakub Narebski @ 2006-09-16 21:44 UTC (permalink / raw)
To: git
Martin Waitz wrote:
> Match the last part of the filename against the extension from the
> mime database instead of insisting that it starts at the first dot.
[...]
> - $filename =~ /\.(.*?)$/;
> - return $mimemap{$1};
Actually, that is a non-greedy match, so the above code insists that
the extension starts at the _last_ dot.
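
Which dot the capture starts at can be checked directly. Perl's regex engine tries the leftmost starting position first, and a lazy quantifier only affects how far the match extends from there, so the outcome may differ from what the word "non-greedy" suggests. A minimal sketch, assuming two hypothetical helpers (lazy_ext and last_ext are not gitweb functions) and a sample name with two dots:

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; not part of gitweb.

# The pattern under discussion: lazy capture anchored at end of string.
sub lazy_ext {
    my ($filename) = @_;
    my ($ext) = $filename =~ /\.(.*?)$/;
    return $ext;
}

# A capture that cannot contain a dot, so only the last dot can match.
sub last_ext {
    my ($filename) = @_;
    my ($ext) = $filename =~ /\.([^.]*)$/;
    return $ext;
}

my $name = 'man/program.8.html';
print "lazy capture:   ", lazy_ext($name), "\n";   # prints "8.html"
print "no-dot capture: ", last_ext($name), "\n";   # prints "html"
```

With the lazy pattern the captured text runs from the first dot to the end of the name, so a lookup keyed on "html" would miss.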
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-16 21:44 ` Jakub Narebski
@ 2006-09-17 7:51 ` Martin Waitz
2006-09-17 8:23 ` Jakub Narebski
2006-09-17 8:41 ` [PATCH] gitweb: use correct mime type even if filename has multiple dots Junio C Hamano
0 siblings, 2 replies; 10+ messages in thread
From: Martin Waitz @ 2006-09-17 7:51 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
hoi :)
On Sat, Sep 16, 2006 at 11:44:52PM +0200, Jakub Narebski wrote:
> Martin Waitz wrote:
>
> > Match the last part of the filename against the extension from the
> > mime database instead of insisting that it starts at the first dot.
> [...]
> > - $filename =~ /\.(.*?)$/;
> > - return $mimemap{$1};
>
> Actually, that is a non-greedy match, so the above code insists that
> the extension starts at the _last_ dot.
hmm, but it didn't work for me.
I had filenames like "man/program.8.html" which got served as
"text/html" with the old code.
Besides, the new code would cope with extensions that themselves
contain a dot.
Looking at /etc/mime.types, it only contains pcf.Z but perhaps
it should also contain tar.gz or similar.
--
Martin Waitz
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-17 7:51 ` Martin Waitz
@ 2006-09-17 8:23 ` Jakub Narebski
2006-09-19 8:29 ` Martin Waitz
2006-09-17 8:41 ` [PATCH] gitweb: use correct mime type even if filename has multiple dots Junio C Hamano
1 sibling, 1 reply; 10+ messages in thread
From: Jakub Narebski @ 2006-09-17 8:23 UTC (permalink / raw)
To: git
Martin Waitz wrote:
> hoi :)
>
> On Sat, Sep 16, 2006 at 11:44:52PM +0200, Jakub Narebski wrote:
>> Martin Waitz wrote:
>>
>> > Match the last part of the filename against the extension from the
>> > mime database instead of insisting that it starts at the first dot.
>> [...]
>> > - $filename =~ /\.(.*?)$/;
>> > - return $mimemap{$1};
>>
>> Actually, that is a non-greedy match, so the above code insists that
>> the extension starts at the _last_ dot.
>
> hmm, but it didn't work for me.
> I had filenames like "man/program.8.html" which got served as
> "text/html" with the old code.
And why shouldn't it? Judging by the extension it is an HTML page, I would
guess a manpage converted to HTML (a pretty-printed manpage). And it should
be served with the text/html mimetype.
> Besides, the new code would cope with extensions that themselves
> contain a dot.
But it is unnecessarily complicated, and I guess performance suffers a bit.
> Looking at /etc/mime.types, it only contains pcf.Z but perhaps
So a *.pcf.Z file wouldn't get the correct mimetype. No big deal.
> it should also contain tar.gz or similar.
You can't properly serve tar.gz with anything other than the
application/x-gzip mimetype. If you want to serve it with the
application/x-tar mimetype, you would need to add
Content-Encoding: x-gzip
in addition to
Content-Type: application/x-tar
And your code doesn't do that.
The _last_ extension defines the type.
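
The header pairing described above can be sketched as follows. This is only an illustration under stated assumptions: headers_for is a hypothetical helper (gitweb has no such function), the fallback type is chosen arbitrarily, and the header values are the ones named in this message.

```perl
use strict;
use warnings;

# Hypothetical helper, not gitweb code: pick response headers from a file name.
sub headers_for {
    my ($filename) = @_;
    if ($filename =~ /\.tar\.gz$/) {
        # Label the payload as a tar archive and declare the gzip layer
        # separately, so the two headers together describe the file.
        return ('Content-Type'     => 'application/x-tar',
                'Content-Encoding' => 'x-gzip');
    }
    if ($filename =~ /\.gz$/) {
        # Last extension only: the file is simply gzip data.
        return ('Content-Type' => 'application/x-gzip');
    }
    # Assumed generic fallback for everything else in this sketch.
    return ('Content-Type' => 'application/octet-stream');
}

my %headers = headers_for('project.tar.gz');
print "$_: $headers{$_}\n" for sort keys %headers;
```

Sending application/x-tar without the Content-Encoding line would mislabel the bytes, which is the objection raised here.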
Besides, with the exception of files which can be displayed in the browser,
i.e. HTML files and images, it doesn't matter what the mimetype is, as long
as binary files get a binary mimetype (e.g. generic application/octet-stream).
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-17 8:23 ` Jakub Narebski
@ 2006-09-19 8:29 ` Martin Waitz
2006-09-19 11:57 ` [PATCH] gitweb: Fix mimetype_guess_file for files with multiple extensions Jakub Narebski
0 siblings, 1 reply; 10+ messages in thread
From: Martin Waitz @ 2006-09-19 8:29 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
On Sun, Sep 17, 2006 at 10:23:45AM +0200, Jakub Narebski wrote:
> Martin Waitz wrote:
>
> > hoi :)
> >
> > hmm, but it didn't work for me.
> > I had filenames like "man/program.8.html" which got served as
> > "text/html" with the old code.
>
> And why shouldn't it? Judging by the extension it is an HTML page, I would
> guess a manpage converted to HTML (a pretty-printed manpage). And it should
> be served with the text/html mimetype.
arg, typo, it got served as "text/plain".
--
Martin Waitz
* [PATCH] gitweb: Fix mimetype_guess_file for files with multiple extensions
2006-09-19 8:29 ` Martin Waitz
@ 2006-09-19 11:57 ` Jakub Narebski
0 siblings, 0 replies; 10+ messages in thread
From: Jakub Narebski @ 2006-09-19 11:57 UTC (permalink / raw)
To: Martin Waitz; +Cc: git
Fix mimetype detection in the "blob_plain" view for files which have
multiple extensions, e.g. foo.1.html; now only the last extension
is used to find the mimetype.
Noticed by Martin Waitz.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This is a much simpler (and faster!) correction to the mentioned problem.
I just don't grok regular expressions, not completely.
gitweb/gitweb.perl | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 01fae94..034cdf1 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1199,7 +1199,7 @@ sub mimetype_guess_file {
 	}
 	close(MIME);

-	$filename =~ /\.(.*?)$/;
+	$filename =~ /\.([^.]*)$/;
 	return $mimemap{$1};
 }

--
1.4.2.1
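
The one-line change above can be exercised outside gitweb. The sketch below assumes a small in-memory table in place of the parsed mime.types file and a hypothetical function name (guess_by_last_ext). It also guards the lookup, because after a failed match $1 still holds whatever an earlier successful match captured:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the table built from mime.types.
my %sample_map = (html => 'text/html', png => 'image/png');

# Hypothetical function for illustration; not the gitweb implementation.
sub guess_by_last_ext {
    my ($filename, $map) = @_;
    # Only use the capture when the match succeeded.
    return undef unless $filename =~ /\.([^.]*)$/;
    return $map->{$1};
}

for my $name ('man/program.8.html', 'logo.png', 'Makefile') {
    my $type = guess_by_last_ext($name, \%sample_map);
    print "$name: ", (defined $type ? $type : '(no match)'), "\n";
}
```

A name without any dot returns undef here, which leaves the choice of a fallback type to the caller.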
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-17 7:51 ` Martin Waitz
2006-09-17 8:23 ` Jakub Narebski
@ 2006-09-17 8:41 ` Junio C Hamano
2006-09-17 8:56 ` Jakub Narebski
2006-09-19 9:23 ` Martin Waitz
1 sibling, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2006-09-17 8:41 UTC (permalink / raw)
To: Martin Waitz; +Cc: git, Jakub Narebski
Martin Waitz <tali@admingilde.org> writes:
>> > - $filename =~ /\.(.*?)$/;
>> > - return $mimemap{$1};
>>
>> Actually, that is a non-greedy match, so the above code insists that
>> the extension starts at the _last_ dot.
>
> hmm, but it didn't work for me.
> I had filenames like "man/program.8.html" which got served as
> "text/html" with the old code.
It based its decision on '.html' part, which is expected from
non-greedy match (if I were writing the pattern, I would have
written /\.[^.]+$/ instead, though). Are you trying to have it
behave differently between "x.8.html" and "x.html"?
> Looking at /etc/mime.types, it only contains pcf.Z but perhaps
> it should also contain tar.gz or similar.
Probably. But that makes me think it might be better to:
- read in mime.types, sort the entries with length of the
suffixes (longer first);
- try matching the suffixes from longer to shorter and pick the
first match.
Without that, you would not be able to cope with a /etc/mime.types
that looks like this, no?
application/a a
application/b b.a
Perhaps something like the attached.
---
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index a81c8d4..4994a33 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1140,23 +1140,42 @@ sub mimetype_guess_file {
 	my $filename = shift;
 	my $mimemap = shift;
 	-r $mimemap or return undef;
+	local($_);

-	my %mimemap;
-	open(MIME, $mimemap) or return undef;
+	open MIME, $mimemap
+		or return undef;
+
+	# Under mod_perl caching this may make a lot of sense...
+	my @mime = ();
+	my $maxlen = 0;
 	while (<MIME>) {
-		next if m/^#/; # skip comments
-		my ($mime, $exts) = split(/\t+/);
-		if (defined $exts) {
-			my @exts = split(/\s+/, $exts);
-			foreach my $ext (@exts) {
-				$mimemap{$ext} = $mime;
+		next if /^#/;
+		chomp;
+		my ($mimetype, @ext) = split(/\s+/);
+		for (@ext) {
+			my $len = length;
+			my $map = $mime[$len];
+			if (!$map) {
+				$mime[$len] = $map = {};
+				$maxlen = $len if ($maxlen < $len);
 			}
+			# We could detect duplicate definition here... i.e.
+			#	onetype ext
+			#	anothertype ext
+			$map->{$_} = $mimetype;
 		}
 	}
-	close(MIME);
+	close MIME;

-	$filename =~ /\.(.*?)$/;
-	return $mimemap{$1};
+	for ($filename) {
+		for (my $len = $maxlen; 0 < $len; $len--) {
+			my $map = $mime[$len];
+			while (my ($ext, $type) = each %$map) {
+				return $type if (/\.\Q$ext\E$/);
+			}
+		}
+	}
+	return undef;
 }

 sub mimetype_guess {
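
The longest-suffix-first idea can also be written with a plain sort instead of per-length buckets. The following is a minimal sketch, not the patch above: it assumes the table is already in a hash, uses a hypothetical function name (guess_longest_suffix), and reuses the two-entry example from this message.

```perl
use strict;
use warnings;

# Hypothetical function for illustration; not the gitweb implementation.
# Tries longer extensions first so "b.a" wins over "a" for "x.b.a".
sub guess_longest_suffix {
    my ($filename, $map) = @_;
    my @by_length = sort { length($b) <=> length($a) || $a cmp $b } keys %$map;
    for my $ext (@by_length) {
        # \Q...\E keeps dots in the extension literal.
        return $map->{$ext} if $filename =~ /\.\Q$ext\E$/;
    }
    return undef;
}

# The two-line mime.types example from the message, as a hash.
my %example = ('a' => 'application/a', 'b.a' => 'application/b');

for my $name ('x.a', 'x.b.a', 'x.c') {
    my $type = guess_longest_suffix($name, \%example);
    print "$name: ", (defined $type ? $type : '(no match)'), "\n";
}
```

Sorting on every call is the simple choice for a sketch; the per-length buckets in the patch avoid the sort but still scan every extension when nothing matches.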
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-17 8:41 ` [PATCH] gitweb: use correct mime type even if filename has multiple dots Junio C Hamano
@ 2006-09-17 8:56 ` Jakub Narebski
2006-09-17 9:07 ` Junio C Hamano
2006-09-19 9:23 ` Martin Waitz
1 sibling, 1 reply; 10+ messages in thread
From: Jakub Narebski @ 2006-09-17 8:56 UTC (permalink / raw)
To: git
Junio C Hamano wrote:
>> Looking at /etc/mime.types, it only contains pcf.Z but perhaps
>> it should also contain tar.gz or similar.
>
> Probably. But that makes me think it might be better to:
>
> - read in mime.types, sort the entries with length of the
> suffixes (longer first);
>
> - try matching the suffixes from longer to shorter and pick the
> first match.
>
> Without that, you would not be able to cope with a /etc/mime.types
> that looks like this, no?
>
> application/a a
> application/b b.a
>
> Perhaps something like the attached.
Is it really useful? Usually the suffix in mime.types doesn't contain a dot
itself. Besides, truth be told, we need the correct mimetype only for files
which can be displayed in the browser (HTML, XHTML, images: png, gif, jpeg,
perhaps XML). All others can get a generic mimetype, i.e.
application/octet-stream for binary files (to be saved) and text/plain for
text files (to be displayed as-is in the browser).
Besides, performance will suffer for "blob_plain" view. One hash lookup
vs. nested loops.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-17 8:56 ` Jakub Narebski
@ 2006-09-17 9:07 ` Junio C Hamano
0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2006-09-17 9:07 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> writes:
> Is it really useful? Usually the suffix in mime.types doesn't contain a dot
> itself. Besides, truth be told, we need the correct mimetype only for files
> which can be displayed in the browser (HTML, XHTML, images: png, gif, jpeg,
> perhaps XML). All others can get a generic mimetype, i.e.
> application/octet-stream for binary files (to be saved) and text/plain for
> text files (to be displayed as-is in the browser).
Sorry, I do not think that is quite correct. The browser can
launch an appropriate application on the downloaded file, as
long as you send the data labelled with proper mimetype. If you
send out application/octet-stream, of course that would not work.
Having said that, I am not sure what special things Martin
wanted to do with x.8.html that is different from y.html, so
maybe discussing the patch does not have merit at all.
* Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.
2006-09-17 8:41 ` [PATCH] gitweb: use correct mime type even if filename has multiple dots Junio C Hamano
2006-09-17 8:56 ` Jakub Narebski
@ 2006-09-19 9:23 ` Martin Waitz
1 sibling, 0 replies; 10+ messages in thread
From: Martin Waitz @ 2006-09-19 9:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jakub Narebski
hoi :)
On Sun, Sep 17, 2006 at 01:41:40AM -0700, Junio C Hamano wrote:
> - read in mime.types, sort the entries with length of the
> suffixes (longer first);
>
> - try matching the suffixes from longer to shorter and pick the
> first match.
>
> Without that, you would not be able to cope with a /etc/mime.types
> that looks like this, no?
>
> application/a a
> application/b b.a
>
> Perhaps something like the attached.
works perfectly, thanks.
--
Martin Waitz