From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
Jan Engelhardt <jengelh@medozas.de>,
Lea Wiemann <lewiemann@gmail.com>
Subject: [PATCH v2] gitweb: Add charset info to "raw" output of 'text/plain' blobs
Date: Tue, 3 Jun 2008 16:47:10 +0200 [thread overview]
Message-ID: <200806031647.10982.jnareb@gmail.com> (raw)
In-Reply-To: <200806011306.45945.jnareb@gmail.com>
Earlier "blob_plain" view sent "charset=utf-8" only when gitweb
guessed the content type to be text by reading from it, and not when
the MIME type was obtained from /etc/mime.types, or when gitweb
couldn't guess mimetype and used $default_blob_plain_mimetype.
This fixes the bug by always add charset info from
$default_text_plain_charset (if it is defined) to "raw" (a=blob_plain)
output for 'text/plain' blobs.
Generating information for Content-Type: header got separated into
blob_contenttype() subroutine; adding charset info in a special case
was removed from blob_mimetype(), which now should return mimetype
only.
While at it cleanup code a bit: put subroutine parameter
initialization first, make error message more robust (when $file_name
is not defined) if more cryptic, remove unnecessary '"' around
variable ("$var" -> $var).
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
On Sun, 1 June 2008, Jakub Narebski wrote:
> On Sat, 31 May 2008, Junio C Hamano wrote:
>> Jakub Narebski <jnareb@gmail.com> writes:
>>
>>> Always add charset info from $default_text_plain_charset (if it is
>>> defined) to "raw" (a=blob_plain) output for 'text/plain' blobs.
>>> Adding charset info in a special case was removed from blob_mimetype().
>>
>> Looks Ok but it took a bit of digging on the list for me to figure out
>> that something like this was missing from the beginning of your commit log
>> message:
>>
>> Earlier "blob_plain" view sent "charset=utf-8" only when gitweb
>> guessed the content type to be text by reading from it, and not
>> when the MIME type was obtained from /etc/mime.types.
>>
>> This fixes the bug by always adding....
>
> I'm sorry that I have forgot to put the "why" in commit message.
> I'd add this when resending v2 of this patch.
Added.
>> But I wonder if moving of this to the calling site is the right thing to
>> do. Wouldn't it become much more contained and robust if you did it this
>> way?
> [...]
>> sub blob_mimetype {
>
> This _might_ be better. I didn't do this for the following two reasons:
[...]
> So perhaps best of those ways would be to create thin wrapper subroutine,
> blob_contenttype($fd, $file_name, $mimetype), where both $file_name and
> (especially) $mimetype are optional parameters, and ise it in
> git_blob_plain() subroutine...
It is now done this way. IMHO it is a best solution.
>> + # Type specific postprocessing can be added as needed...
>> + if ($mime =~ /^text\//i &&
>> + $mime !~ /charset=/i && $default_text_plain_charset) {
>> + $mime .= '; charset='.$default_text_plain_charset;
>> + }
>> +
>> + return $mime;
>
> I'm not sure about it. I worry a bit about text/html, which can, and
> usually do, contain charset info inside the document. I'm not sure
> what happens when charset information from HTTP headers contradict
> charset information from presented file. That's why I have limited
> adding charset info purely to 'text/plain', not 'text/*' without
> charset info already present.
Currently for the above reason gitweb adds charset info _only_
for 'text/plain' mimetype.
gitweb/gitweb.perl | 29 ++++++++++++++++++++---------
1 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 57a1905..c6d43bf 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2481,8 +2481,7 @@ sub blob_mimetype {
return $default_blob_plain_mimetype unless $fd;
if (-T $fd) {
- return 'text/plain' .
- ($default_text_plain_charset ? '; charset='.$default_text_plain_charset : '');
+ return 'text/plain';
} elsif (! $filename) {
return 'application/octet-stream';
} elsif ($filename =~ m/\.png$/i) {
@@ -2496,6 +2495,17 @@ sub blob_mimetype {
}
}
+sub blob_contenttype {
+ my ($fd, $file_name, $type) = @_;
+
+ $type ||= blob_mimetype($fd, $file_name);
+ if ($type eq 'text/plain' && defined $default_text_plain_charset) {
+ $type .= "; charset=$default_text_plain_charset";
+ }
+
+ return $type;
+}
+
## ======================================================================
## functions printing HTML: header, footer, error page
@@ -4377,6 +4387,7 @@ sub git_heads {
}
sub git_blob_plain {
+ my $type = shift;
my $expires;
if (!defined $hash) {
@@ -4392,13 +4403,13 @@ sub git_blob_plain {
$expires = "+1d";
}
- my $type = shift;
open my $fd, "-|", git_cmd(), "cat-file", "blob", $hash
- or die_error(undef, "Couldn't cat $file_name, $hash");
+ or die_error(undef, "Open git-cat-file blob '$hash' failed");
- $type ||= blob_mimetype($fd, $file_name);
+ # content-type (can include charset)
+ $type = blob_contenttype($fd, $file_name, $type);
- # save as filename, even when no $file_name is given
+ # "save as" filename, even when no $file_name is given
my $save_as = "$hash";
if (defined $file_name) {
$save_as = $file_name;
@@ -4407,9 +4418,9 @@ sub git_blob_plain {
}
print $cgi->header(
- -type => "$type",
- -expires=>$expires,
- -content_disposition => 'inline; filename="' . "$save_as" . '"');
+ -type => $type,
+ -expires => $expires,
+ -content_disposition => 'inline; filename="' . $save_as . '"');
undef $/;
binmode STDOUT, ':raw';
print <$fd>;
--
1.5.5.3
next prev parent reply other threads:[~2008-06-03 14:48 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-05-28 18:04 gitweb forgets to send utf8 header for raw blob views Jan Engelhardt
2008-05-29 11:32 ` Lea Wiemann
2008-05-30 8:18 ` Jakub Narebski
2008-05-31 11:27 ` [PATCH] gitweb: Add charset info to "raw" blob output Jakub Narebski
2008-05-31 18:22 ` Junio C Hamano
2008-06-01 11:06 ` Jakub Narebski
2008-06-01 12:15 ` Jan Engelhardt
2008-06-01 12:16 ` Jan Engelhardt
2008-06-03 14:47 ` Jakub Narebski [this message]
2008-05-31 15:04 ` gitweb forgets to send utf8 header for raw blob views Jan Engelhardt
2008-05-31 22:39 ` Jakub Narebski
2008-06-01 2:08 ` Jan Engelhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200806031647.10982.jnareb@gmail.com \
--to=jnareb@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jengelh@medozas.de \
--cc=lewiemann@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.