From: Jakub Narebski <jnareb@gmail.com>
To: Gerrit Pape <pape@smarden.org>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
"Recai Oktaş" <roktas@debian.org>
Subject: Re: [PATCH/rfc] gitweb: open files (e.g. indextext.html) in utf8 mode
Date: Wed, 02 Jul 2008 06:37:38 -0700 (PDT)
Message-ID: <m3prpwflus.fsf@localhost.localdomain>
In-Reply-To: <20080702121317.10819.qmail@bca5b84cb0e0a0.315fe32.mid.smarden.org>
Gerrit Pape <pape@smarden.org> writes:
> From: =?utf-8?q?Recai=20Okta=C5=9F?= <roktas@debian.org>
You don't need to use quoted-printable in the 'From:' header embedded in
the mail body. It should probably read
From: "Recai Oktaş" <roktas@debian.org>
(provided that you can use utf-8 in email).
> gitweb used to use utf8 only in stdout. As a result, included files
> like indextext.html appeared garbled if they contain utf8 characters.
> Now utf8 is also used when reading files.
It would read better as:
Gitweb used to use utf8 mode only on STDOUT (actually the ":utf8" output
layer), relying on to_utf8(...) to convert input data from utf8
to Perl internal form. As a result, included files such as $home_text
(indextext.html in the default build configuration), or a repository's
README.html, appeared garbled if they contained UTF-8 characters.
Now utf8 mode is used for all open invocations, also when reading files.
> The patch was submitted through
> http://bugs.debian.org/487465
>
There should probably be the following here:
Reported-by: Recai Oktaş <roktas@debian.org>
> Signed-off-by: Gerrit Pape <pape@smarden.org>
> ---
> gitweb/gitweb.perl | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 90cd99b..96cb4e0 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -16,7 +16,7 @@ use Encode;
> use Fcntl ':mode';
> use File::Find qw();
> use File::Basename qw(basename);
> -binmode STDOUT, ':utf8';
> +use open qw(:std :utf8);
>
> BEGIN {
> CGI->compile() if $ENV{'MOD_PERL'};
It would be wonderful if such a simple solution worked. We would
then be able to remove the to_utf8() subroutine and not worry that we
forgot to convert some string to Perl's internal encoding, which could
result in a wide (non US-ASCII) UTF-8 character being cut in
half. (On the other hand, we wouldn't have $fallback_encoding.)
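To illustrate what per-string conversion with a fallback encoding buys us,
here is a minimal sketch. It is not gitweb's actual to_utf8(); the helper
name decode_with_fallback, the local $fallback_encoding value and the sample
strings are all assumptions made for this example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: a stand-in for gitweb's $fallback_encoding setting.
my $fallback_encoding = 'latin1';

# Hypothetical helper: try strict UTF-8 first, else use the fallback.
sub decode_with_fallback {
	my ($bytes) = @_;
	my $copy = $bytes;    # decode() with a CHECK value may modify its input
	my $res = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
	return $res if defined $res;
	return decode($fallback_encoding, $bytes, Encode::FB_DEFAULT());
}

my $from_utf8   = decode_with_fallback("Okta\xC5\x9F");   # valid UTF-8 bytes
my $from_latin1 = decode_with_fallback("J\xC4ger");       # not valid UTF-8

# Both results are character strings; print lengths only, so that no
# output layer is needed on STDOUT.
printf "%d %d\n", length($from_utf8), length($from_latin1);
```

With a blanket ":utf8" input layer there is no place to hook in the second
branch, which is why $fallback_encoding would be lost.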
Unfortunately there are two problems (or rather a problem and a half)
with this approach.
The first is that with this patch gitweb doesn't pass the gitweb test
t/t9500-gitweb-standalone-no-errors.sh (this is with perl v5.8.6):
* ok 63: encode(commit): utf8
* ok 64: encode(commit): iso-8859-1
* ok 65: encode(log): utf-8 and iso-8859-1
[...]
* FAIL 71: URL: no project URLs, no base URL
gitweb_run "p=.git;a=summary"
[Wed Jul 2 13:10:15 2008] gitweb.perl: utf8 "\xC4" does not map to Unicode \
at /path/to/git/t/trash directory/../../gitweb/gitweb.perl line 2298, \
<$fd> line 1.
[Wed Jul 2 13:10:15 2008] gitweb.perl: Malformed UTF-8 character \
(unexpected end of string) at [...]/gitweb/gitweb.perl line 2303, \
<$fd> line 1.
which is
open my $fd, '-|', git_cmd(), 'for-each-ref',
($limit ? '--count='.($limit+1) : ()), '--sort=-committerdate',
'--format=%(objectname) %(refname) %(subject)%00%(committer)',
'refs/heads'
or return;
2298: while (my $line = <$fd>) {
my %ref_item;
chomp $line;
my ($refinfo, $committerinfo) = split(/\0/, $line);
2303: my ($hash, $name, $title) = split(' ', $refinfo, 3);
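The failure above can probably be reproduced outside gitweb. The sketch
below is an assumption-laden stand-in: instead of the 'git for-each-ref'
pipe it reads a temporary file containing a hypothetical latin1 name (byte
0xC4), once through the ":utf8" layer and once as raw bytes. Whether and how
the first read warns may depend on the Perl version, so the warnings are
only collected and counted.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample: one line holding a latin1-encoded name (byte 0xC4).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                       # write the bytes untouched
print {$out} "J\xC4ger\n";
close $out or die "close: $!";

# Read through the :utf8 layer, which is what "use open qw(:std :utf8)"
# would arrange for opens in its scope; collect warnings quietly.
my @warnings;
{
	local $SIG{__WARN__} = sub { push @warnings, @_ };
	open my $fd, '<:utf8', $path or die "open: $!";
	my $line = <$fd>;
	close $fd;
}

# Read the same file as bytes; nothing is interpreted as UTF-8 here, so
# the data can still be converted later, string by string.
open my $raw, '<:raw', $path or die "open: $!";
my $bytes = <$raw>;
close $raw;
chomp $bytes;

printf "warnings from :utf8 read: %d\n", scalar @warnings;
printf "raw read kept %d bytes\n", length $bytes;
```

The raw read keeps the data intact, which is what per-string conversion
with a fallback encoding relies on.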
Second, what is the minimal Perl version and Perl configuration (installed
modules) that supports "use open qw(:std :utf8);"? We do have some
minimal requirements for gitweb, and it would be nice if we didn't add
to them. But we already require PerlIO, so it probably doesn't matter.
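One way to gather data points for this question is to run a small probe on
the Perl installations one cares about. This is only a sketch: it reports
the interpreter version, the useperlio build setting from Config, and
whether open.pm can be loaded; it does not by itself establish a minimal
version.

```perl
use strict;
use warnings;
use Config;

# Report the running interpreter's version.
printf "perl version: %vd\n", $^V;

# useperlio comes from the build configuration; it may be unset on
# builds that do not record it.
my $useperlio = defined $Config{useperlio} ? $Config{useperlio} : '(unset)';
print "useperlio: $useperlio\n";

# Load open.pm by file name, so the pragma is only located and compiled,
# not applied to any filehandle.
my $has_open_pragma = eval { require 'open.pm'; 1 } ? 1 : 0;
print "open.pm loadable: ", ($has_open_pragma ? "yes" : "no"), "\n";
```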
--
Jakub Narebski
Poland
ShadeHawk on #git