* [PATCH] gitweb: safely output binary files for 'blob_plain' action
From: Jakub Narebski @ 2006-06-17 11:32 UTC (permalink / raw)
To: git; +Cc: Kay Sievers, Junio C Hamano
gitweb now tries to output the correct Content-Type header for the
'blob_plain' action: for now text/plain for text files, the
appropriate image MIME type for *.png, *.gif and *.jpg/*.jpeg files,
and application/octet-stream for other binary files.
Introduced new configuration variables: $default_blob_plain_mimetype
and $default_text_plain_charset (only 'utf-8' is guaranteed to work
for the latter).
The binmode of STDOUT is switched to ':raw' in git_blob_plain while outputting non-text files.
---
gitweb/gitweb.cgi | 43 +++++++++++++++++++++++++++++++++++++++----
1 files changed, 39 insertions(+), 4 deletions(-)
diff --git a/gitweb/gitweb.cgi b/gitweb/gitweb.cgi
index 9d902b7..b37ec50 100755
--- a/gitweb/gitweb.cgi
+++ b/gitweb/gitweb.cgi
@@ -39,12 +39,16 @@ # html text to include at home page
my $home_text = "indextext.html";
# URI of default stylesheet
-my $stylesheet = "gitweb.css";
+my $stylesheet = "gitweb.css";
# source of projects list
#my $projects_list = $projectroot;
my $projects_list = "index/index.aux";
+# default blob_plain mimetype and default charset for text/plain blob
+my $default_blob_plain_mimetype = 'text/plain';
+my $default_text_plain_charset = 'utf-8'; # can be undefined
+
# input validation and dispatch
my $action = $cgi->param('a');
if (defined $action) {
@@ -1354,15 +1358,46 @@ sub git_blob {
git_footer_html();
}
+sub git_blob_plain_mimetype {
+ my $fd = shift;
+ my $filename = shift;
+
+ # just in case
+ return $default_blob_plain_mimetype unless $fd;
+
+ if (-T $fd) {
+ return 'text/plain' .
+ ($default_text_plain_charset ? '; charset='.$default_text_plain_charset : '');
+ } elsif (! $filename) {
+ return 'application/octet-stream';
+ } elsif ($filename =~ m/\.png$/i) {
+ return 'image/png';
+ } elsif ($filename =~ m/\.gif$/i) {
+ return 'image/gif';
+ } elsif ($filename =~ m/\.jpe?g$/i) {
+ return 'image/jpeg';
+ } else {
+ return 'application/octet-stream';
+ }
+}
+
sub git_blob_plain {
- my $save_as = "$hash.txt";
+ open my $fd, "-|", "$gitbin/git-cat-file blob $hash" or return;
+ my $type = git_blob_plain_mimetype($fd, $file_name);
+
+ # save as filename, even when no $file_name is given
+ my $save_as = "$hash";
if (defined $file_name) {
$save_as = $file_name;
+ } elsif ($type =~ m/^text\//) {
+ $save_as .= '.txt';
}
- print $cgi->header(-type => "text/plain", -charset => 'utf-8', '-content-disposition' => "inline; filename=\"$save_as\"");
- open my $fd, "-|", "$gitbin/git-cat-file blob $hash" or return;
+
+ print $cgi->header(-type => "$type", '-content-disposition' => "inline; filename=\"$save_as\"");
undef $/;
+ binmode STDOUT, ':raw' unless $type =~ m/^text\//;
print <$fd>;
+ binmode STDOUT, ':utf8' unless $type =~ m/^text\//;
$/ = "\n";
close $fd;
}
--
1.3.0
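
The decision order in git_blob_plain_mimetype can be tried outside gitweb with a small standalone sketch. This is not the patch's code: the function name guess_blob_mimetype is hypothetical, and the -T test on the filehandle is replaced by an $is_text flag passed in by the caller so the logic runs without a pipe to git-cat-file.

```perl
use strict;
use warnings;

# Hypothetical standalone variant of the patch's decision order.
# $is_text stands in for the "-T $fd" test used in the patch.
sub guess_blob_mimetype {
    my ($is_text, $filename, $charset) = @_;

    if ($is_text) {
        # The charset parameter is only appended when one is configured.
        return 'text/plain' . ($charset ? "; charset=$charset" : '');
    }
    return 'application/octet-stream'
        unless defined $filename && length $filename;
    return 'image/png'  if $filename =~ /\.png\z/i;
    return 'image/gif'  if $filename =~ /\.gif\z/i;
    return 'image/jpeg' if $filename =~ /\.jpe?g\z/i;
    return 'application/octet-stream';
}

print guess_blob_mimetype(1, 'README',   'utf-8'), "\n";
print guess_blob_mimetype(0, 'logo.PNG', undef),   "\n";
print guess_blob_mimetype(0, 'data.bin', undef),   "\n";
```

Note that, as in the patch, the text check comes first, so a text file named foo.png would still be served as text/plain.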
* Re: [PATCH] gitweb: safely output binary files for 'blob_plain' action
From: Petr Baudis @ 2006-06-17 15:35 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git, Kay Sievers, Junio C Hamano
Dear diary, on Sat, Jun 17, 2006 at 01:32:15PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Introduced new configuration variables: $default_blob_plain_mimetype
> and $default_text_plain_charset (only 'utf-8' is guaranteed to work
> for the latter).
Nah, defaulting to 'utf-8' is horrible - usually, you just don't have a
clue and should refrain from sending any charset information at all, so
I think undef is a much saner default.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
A person is just about as big as the things that make them angry.
* [PATCH] gitweb: text files for 'blob_plain' action without charset by default
From: Jakub Narebski @ 2006-06-17 16:07 UTC (permalink / raw)
To: git
$default_text_plain_charset is now undefined (no charset specified) by
default. Additionally, the ':raw' binmode layer is now always used when
outputting file content, for text files as well.
---
This patch depends on the previous patch in the thread:
"gitweb: safely output binary files for 'blob_plain' action"
Unlike the previous patch, it is not based on other unrelated gitweb.cgi
patches (this only changes the line numbers in the patch).
gitweb/gitweb.cgi | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
53209981db06a5dde7c59caada279bf63d329da8
diff --git a/gitweb/gitweb.cgi b/gitweb/gitweb.cgi
index acac1f4..f082e5d 100755
--- a/gitweb/gitweb.cgi
+++ b/gitweb/gitweb.cgi
@@ -44,7 +44,7 @@ my $projects_list = "index/index.aux";
# default blob_plain mimetype and default charset for text/plain blob
my $default_blob_plain_mimetype = 'text/plain';
-my $default_text_plain_charset = 'utf-8'; # can be undefined
+my $default_text_plain_charset = undef; # was: 'utf-8'
# input validation and dispatch
my $action = $cgi->param('a');
@@ -1451,9 +1451,9 @@ sub git_blob_plain {
print $cgi->header(-type => "$type", '-content-disposition' => "inline; filename=\"$save_as\"");
undef $/;
- binmode STDOUT, ':raw' unless $type =~ m/^text\//;
+ binmode STDOUT, ':raw';
print <$fd>;
- binmode STDOUT, ':utf8' unless $type =~ m/^text\//;
+ binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
$/ = "\n";
close $fd;
}
--
1.3.0
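
The two changes in this follow-up (no charset parameter unless one is configured, and byte-for-byte output for every blob) can be illustrated in isolation. The sketch below is an assumption-laden example, not gitweb code: content_type and copy_blob_raw are hypothetical helper names, and in-memory filehandles stand in for the git-cat-file pipe and STDOUT.

```perl
use strict;
use warnings;

# Hypothetical helper: compose the Content-Type value. No charset
# parameter is emitted when $charset is undefined.
sub content_type {
    my ($type, $charset) = @_;
    return defined $charset ? "$type; charset=$charset" : $type;
}

# Hypothetical helper: copy everything from $in to $out unchanged,
# then put $out back into the ':utf8' mode the rest of the script uses.
sub copy_blob_raw {
    my ($in, $out) = @_;
    binmode $in,  ':raw';
    binmode $out, ':raw';
    local $/;                      # slurp mode, restored on return
    print {$out} scalar <$in>;
    binmode $out, ':utf8';
    return;
}

print content_type('text/plain', undef),   "\n";
print content_type('text/plain', 'utf-8'), "\n";
```

Using local $/ instead of the undef $/ ... $/ = "\n" pair means the record separator is restored even if the function returns early.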
* Re: [PATCH] gitweb: safely output binary files for 'blob_plain' action
From: Junio C Hamano @ 2006-06-17 21:13 UTC (permalink / raw)
To: Petr Baudis; +Cc: Jakub Narebski, git, Kay Sievers
Petr Baudis <pasky@suse.cz> writes:
> Dear diary, on Sat, Jun 17, 2006 at 01:32:15PM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...
>> Introduced new configuration variables: $default_blob_plain_mimetype
>> and $default_text_plain_charset (only 'utf-8' is guaranteed to work
>> for the latter).
>
> Nah, defaulting to 'utf-8' is horrible - usually, you just don't have a
> clue and should refrain from sending any charset information at all, so
> I think undef is a much saner default.
Concurred. I see Jakub's second patch to make this
configurable, but I wonder about a few things:
- we might want to have a configuration mechanism in place
before enhancing gitweb. My gut feeling is that we can use
[gitweb] section in project.git/config (and probably
duplicate first and deprecate later existing "description" as
well).
- the blob charset should be per path; otherwise the feature
would not be useful for projects that maintain a bunch of po
files.
In other words, something like this:
    (in torvalds/linux-2.6.git/config)

    [gitweb]
        description = "Linus's kernel tree"
        ; defaultblobcharset = "latin1"
        blobmimemapfile = "mime-map"

    (in torvalds/linux-2.6.git/mime-map, first match decides)

    fs/nls/nls_euc-jp.c    text/plain; charset=euc_jp
    *.c                    text/plain; charset=utf-8
    *.h                    text/plain; charset=utf-8

I do not think defaultblobcharset above is a good idea though.
You could just make the last entry in the mime-map file:

    *                      text/plain; charset=latin1
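
A "first match decides" lookup over such a mime-map file could look roughly like the sketch below. Everything here is an assumption made for illustration: the function names, the whitespace-separated line format, the '#' comment syntax, and the rule that a pattern without a slash is matched against the end of the path. The proposal above does not specify any of these details.

```perl
use strict;
use warnings;

# Translate a simple glob ('*' and '?' only) into a regex.
# Assumption: a pattern containing '/' must match the whole path,
# any other pattern is matched against the basename.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = '';
    for my $ch (split //, $glob) {
        $re .= $ch eq '*' ? '.*' : $ch eq '?' ? '.' : quotemeta($ch);
    }
    return $glob =~ m{/} ? qr/\A$re\z/ : qr{(?:\A|/)$re\z};
}

# Parse mime-map text: one "PATTERN VALUE" rule per line, in file order.
sub parse_mime_map {
    my ($text) = @_;
    my @rules;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*#/ || $line !~ /\S/;   # comments, blank lines
        my ($pattern, $value) = $line =~ /^\s*(\S+)\s+(.*?)\s*$/ or next;
        push @rules, [ glob_to_regex($pattern), $value ];
    }
    return \@rules;
}

# Return the value of the first rule whose pattern matches $path.
sub lookup_mime {
    my ($rules, $path) = @_;
    for my $rule (@$rules) {
        my ($re, $value) = @$rule;
        return $value if $path =~ $re;
    }
    return;
}

my $rules = parse_mime_map(<<'EOF');
fs/nls/nls_euc-jp.c  text/plain; charset=euc_jp
*.c                  text/plain; charset=utf-8
*.h                  text/plain; charset=utf-8
EOF

print lookup_mime($rules, 'fs/nls/nls_euc-jp.c'), "\n";
print lookup_mime($rules, 'kernel/sched.c'), "\n";
```

Because rules are kept in file order, a catch-all '*' entry placed last acts as the default, which is why a separate defaultblobcharset key is not needed.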
* Re: [PATCH] gitweb: safely output binary files for 'blob_plain' action
From: Petr Baudis @ 2006-06-17 22:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jakub Narebski, git, Kay Sievers
Dear diary, on Sat, Jun 17, 2006 at 11:13:30PM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> - we might want to have a configuration mechanism in place
> before enhancing gitweb. My gut feeling is that we can use
> [gitweb] section in project.git/config (and probably
> duplicate first and deprecate later existing "description" as
> well).
Agreed. (I planned to back this up with a patch, then looked at the
clock.)
Hmm, after I'm over my exam period, since there's now another .pl thing
in the git tree I might start working on some kind of universal Git.pm
interface. I'm gonna need it for Cogito in the longer term anyway. ;-)
> - the blob charset should be per path; otherwise the feature
> would not be useful for projects that maintain a bunch of po
> files.
>
> In other words, something like this:
>
> (in torvalds/linux-2.6.git/config)
>
> [gitweb]
> description = "Linus's kernel tree"
> ; defaultblobcharset = "latin1"
> blobmimemapfile = "mime-map"
>
> (in torvalds/linux-2.6.git/mime-map, first match decides)
>
> fs/nls/nls_euc-jp.c text/plain; charset=euc_jp
> *.c text/plain; charset=utf-8
> *.h text/plain; charset=utf-8
You could as well just support the mime.types format and load
/etc/mime.types for this kind of mapping (see below for a patch). The
advantage is that this pretty much covers all the MIME types you will
need, the disadvantage is that it's less flexible and the charset part
wouldn't probably fit in nicely.
We could obviously do both. :-)
---
[PATCH] Support for the standard mime.types map in gitweb
gitweb will try to look up the filename mimetype in /etc/mime.types
and optionally a user-configured mime.types map as well.
Signed-off-by: Petr Baudis <pasky@suse.cz>
---
Depends on Jakub's mime patches.
gitweb/gitweb.cgi | 44 ++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/gitweb/gitweb.cgi b/gitweb/gitweb.cgi
index 9250548..0116531 100755
--- a/gitweb/gitweb.cgi
+++ b/gitweb/gitweb.cgi
@@ -46,6 +46,11 @@ # default blob_plain mimetype and defaul
my $default_blob_plain_mimetype = 'text/plain';
my $default_text_plain_charset = undef; # was: 'utf-8'
+# file to use for guessing MIME types before trying /etc/mime.types
+# (relative to the current git repository)
+my $mimetypes_file = undef;
+
+
# input validation and dispatch
my $action = $cgi->param('a');
if (defined $action) {
@@ -1414,6 +1419,40 @@ sub git_blob {
git_footer_html();
}
+sub mimetype_guess_file {
+ my $filename = shift;
+ my $mimemap = shift;
+ -r $mimemap or return undef;
+
+ my %mimemap;
+ open(MIME, $mimemap) or return undef;
+ while (<MIME>) {
+ my ($mime, $exts) = split(/\t+/);
+ my @exts = split(/\s+/, $exts);
+ foreach my $ext (@exts) {
+ $mimemap{$ext} = $mime;
+ }
+ }
+ close(MIME);
+
+ $filename =~ /\.([^.]*)$/; # last extension only, so "foo.tar.gz" looks up "gz"
+ return $mimemap{$1};
+}
+
+sub mimetype_guess {
+ my $filename = shift;
+ my $mime;
+ $filename =~ /\./ or return undef;
+
+ if ($mimetypes_file) {
+ my $file = $mimetypes_file;
+ $file =~ m#^/# or $file = "$projectroot/$path/$file";
+ $mime = mimetype_guess_file($filename, $file);
+ }
+ $mime ||= mimetype_guess_file($filename, '/etc/mime.types');
+ return $mime;
+}
+
sub git_blob_plain_mimetype {
my $fd = shift;
my $filename = shift;
@@ -1421,6 +1460,11 @@ sub git_blob_plain_mimetype {
# just in case
return $default_blob_plain_mimetype unless $fd;
+ if ($filename) {
+ my $mime = mimetype_guess($filename);
+ $mime and return $mime;
+ }
+
if (-T $fd) {
return 'text/plain' .
($default_text_plain_charset ? '; charset='.$default_text_plain_charset : '');
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
A person is just about as big as the things that make them angry.
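
For readers who want to experiment with the mime.types lookup outside gitweb, here is a minimal sketch under stated assumptions. It is not the patch above: parse_mime_types and mime_for_filename are hypothetical names, the table is parsed from a string rather than from /etc/mime.types, and the sample entries are illustrative data. Compared with the patch, it skips '#' comments, accepts any whitespace between fields, and looks up only the last extension in lower case.

```perl
use strict;
use warnings;

# Build an extension => MIME type table from mime.types-style text.
sub parse_mime_types {
    my ($text) = @_;
    my %by_ext;
    for my $line (split /\n/, $text) {
        $line =~ s/#.*//;                       # strip comments
        my ($mime, @exts) = split ' ', $line;   # any whitespace separates fields
        next unless defined $mime;              # blank or comment-only line
        $by_ext{ lc($_) } = $mime for @exts;    # a later entry overrides an earlier one
    }
    return \%by_ext;
}

# Look up a filename by its last extension; returns nothing if unknown.
sub mime_for_filename {
    my ($by_ext, $filename) = @_;
    my ($ext) = $filename =~ m{\.([^./]+)\z} or return;
    return $by_ext->{ lc($ext) };
}

# Sample data only, not a copy of any real /etc/mime.types.
my $types = parse_mime_types(<<'EOF');
# MIME type          extensions
image/png            png
image/jpeg           jpeg jpg jpe
application/gzip     gz
EOF

print mime_for_filename($types, 'photo.JPG'), "\n";
print mime_for_filename($types, 'foo.tar.gz'), "\n";
```

Parsing the table once and reusing the hash avoids re-reading the file for every request, at the cost of not noticing edits to the file while the process is running.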
* Re: [PATCH] gitweb: safely output binary files for 'blob_plain' action
From: Junio C Hamano @ 2006-06-17 22:30 UTC (permalink / raw)
To: Petr Baudis; +Cc: Jakub Narebski, git, Kay Sievers
Petr Baudis <pasky@suse.cz> writes:
>> In other words, something like this:
>>
>> (in torvalds/linux-2.6.git/config)
>>
>> [gitweb]
>> description = "Linus's kernel tree"
>> ; defaultblobcharset = "latin1"
>> blobmimemapfile = "mime-map"
>>
>> (in torvalds/linux-2.6.git/mime-map, first match decides)
>>
>> fs/nls/nls_euc-jp.c text/plain; charset=euc_jp
>> *.c text/plain; charset=utf-8
>> *.h text/plain; charset=utf-8
>
> You could as well just support the mime.types format and load
> /etc/mime.types for this kind of mapping (see below for a patch). The
> advantage is that this pretty much covers all the MIME types you will
> need, the disadvantage is that it's less flexible and the charset part
> wouldn't probably fit in nicely.
Ah, I thought Jakub's patch was already taking care of
mime.types but apparently that was not the case. As you say,
using /etc/mime.types for this is obviously a good point to
start.
> We could obviously do both. :-)
The point of my example was about charset part; comparing the
suffix part only is not good enough for .po files, so we should
obviously do both.
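
"Doing both" could be arranged as an ordered chain of lookups in which a per-path rule wins over the suffix table. The sketch below only illustrates that ordering: first_mime, the two closures, and the sample tables are hypothetical stand-ins for the per-repository map and the mime.types data discussed in the thread.

```perl
use strict;
use warnings;

# Try each lookup in order and return the first defined answer.
sub first_mime {
    my ($path, @lookups) = @_;
    for my $lookup (@lookups) {
        my $mime = $lookup->($path);
        return $mime if defined $mime;
    }
    return;
}

# Sample data only: one per-path override and a small suffix table.
my %per_path = ('po/ja.po' => 'text/plain; charset=euc-jp');
my %by_ext   = (po => 'text/plain', png => 'image/png');

# Per-path rule: exact path match in this simplified stand-in.
my $by_path = sub { return $per_path{ $_[0] } };

# Suffix rule: last extension, compared in lower case.
my $by_suffix = sub {
    my ($ext) = $_[0] =~ m{\.([^./]+)\z} or return;
    return $by_ext{ lc($ext) };
};

print first_mime('po/ja.po', $by_path, $by_suffix), "\n";
print first_mime('po/de.po', $by_path, $by_suffix), "\n";
```

Both .po files share a suffix, so only the per-path step can give them different charsets, which is the case the suffix table alone cannot express.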