From: Jakub Narebski <jnareb@gmail.com>
To: Gerrit Pape <pape@smarden.org>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
"Recai Oktaş" <roktas@debian.org>
Subject: Re: [PATCH/rfc] gitweb: open files (e.g. indextext.html) in utf8 mode
Date: Wed, 02 Jul 2008 06:37:38 -0700 (PDT)
Message-ID: <m3prpwflus.fsf@localhost.localdomain>
In-Reply-To: <20080702121317.10819.qmail@bca5b84cb0e0a0.315fe32.mid.smarden.org>
Gerrit Pape <pape@smarden.org> writes:
> From: =?utf-8?q?Recai=20Okta=C5=9F?= <roktas@debian.org>
You don't need to use quoted-printable in the 'From:' header embedded
in the mail body. It should probably read
From: "Recai Oktaş" <roktas@debian.org>
(provided that you can use utf-8 in email).
> gitweb used to use utf8 only in stdout. As a result, included files
> like indextext.html appeared garbled if they contain utf8 characters.
> Now utf8 is also used when reading files.
It would better read as:
Gitweb used to use utf8 mode only on STDOUT (actually the ":utf8" output
layer), relying on to_utf8(...) to convert input data from utf8
to Perl internal form. As a result, included files such as $home_text
(indextext.html in the default build configuration), or a repository's
README.html, appeared garbled if they contained UTF-8 characters.
Now utf8 mode is used for all open invocations, also when reading files.
> The patch was submitted through
> http://bugs.debian.org/487465
>
You should probably have here
Reported-by: Recai Oktaş <roktas@debian.org>
> Signed-off-by: Gerrit Pape <pape@smarden.org>
> ---
> gitweb/gitweb.perl | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 90cd99b..96cb4e0 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -16,7 +16,7 @@ use Encode;
> use Fcntl ':mode';
> use File::Find qw();
> use File::Basename qw(basename);
> -binmode STDOUT, ':utf8';
> +use open qw(:std :utf8);
>
> BEGIN {
> CGI->compile() if $ENV{'MOD_PERL'};
It would be wonderful if such a simple solution worked. We would
then be able to remove the to_utf8() subroutine and not worry that we
forgot to convert some string to Perl internal encoding, which could
result in a wide (non US-ASCII) UTF-8 character being cut in
half. (On the other hand we wouldn't have $fallback_encoding).
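For reference, the kind of conversion that to_utf8() and $fallback_encoding provide can be sketched roughly as below. This is only an illustration, not the actual gitweb code: the helper name decode_with_fallback is hypothetical, and Latin-1 as the fallback is an assumption made for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption for this sketch: Latin-1 stands in for $fallback_encoding.
my $fallback_encoding = 'latin1';

# Hypothetical helper: try strict UTF-8 first, and fall back to
# $fallback_encoding when the bytes are not well-formed UTF-8.
sub decode_with_fallback {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its input
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $text if defined $text;
    return decode($fallback_encoding, $bytes);
}

my $from_utf8   = decode_with_fallback("Okta\xC5\x9F");    # UTF-8 bytes of "ş"
my $from_latin1 = decode_with_fallback("\xC4");            # lone Latin-1 byte

# prints "U+015F U+00C4"
printf "U+%04X U+%04X\n", ord(substr($from_utf8, -1)), ord($from_latin1);
```

A blanket ":utf8" input layer has no such second chance, which is why the fallback would be lost.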
Unfortunately there are two problems (or rather a problem and a half)
with this approach.
The first is that with this patch gitweb doesn't pass the gitweb test
t/t9500-gitweb-standalone-no-errors.sh (this is with perl v5.8.6):
* ok 63: encode(commit): utf8
* ok 64: encode(commit): iso-8859-1
* ok 65: encode(log): utf-8 and iso-8859-1
[...]
* FAIL 71: URL: no project URLs, no base URL
gitweb_run "p=.git;a=summary"
[Wed Jul 2 13:10:15 2008] gitweb.perl: utf8 "\xC4" does not map to Unicode \
at /path/to/git/t/trash directory/../../gitweb/gitweb.perl line 2298, \
<$fd> line 1.
[Wed Jul 2 13:10:15 2008] gitweb.perl: Malformed UTF-8 character \
(unexpected end of string) at [...]/gitweb/gitweb.perl line 2303, \
<$fd> line 1.
which is
open my $fd, '-|', git_cmd(), 'for-each-ref',
($limit ? '--count='.($limit+1) : ()), '--sort=-committerdate',
'--format=%(objectname) %(refname) %(subject)%00%(committer)',
'refs/heads'
or return;
2298: while (my $line = <$fd>) {
my %ref_item;
chomp $line;
my ($refinfo, $committerinfo) = split(/\0/, $line);
2303: my ($hash, $name, $title) = split(' ', $refinfo, 3);
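The failure can be reproduced in miniature without git. The sketch below uses a made-up record shaped like the for-each-ref output above, with a Latin-1 byte 0xC4 in the subject (the hash, subject and committer are hypothetical). It reads the record in byte mode, checks whether it is well-formed UTF-8, and splits it at the byte level, which works regardless of the encoding:

```perl
use strict;
use warnings;

# Hypothetical record: "<hash> <refname> <subject>\0<committer>", where the
# subject contains a Latin-1 "Ä" (byte 0xC4) rather than UTF-8.
my $record = "0123abc refs/heads/master Fix \xC4rger\0A U Thor\n";

# Read it back through an in-memory handle in raw (byte) mode.
open my $fd, '<:raw', \$record or die "cannot open in-memory handle: $!";
my $line = <$fd>;
close $fd;
chomp $line;

# utf8::decode() returns false and leaves the string untouched when the
# bytes are not well-formed UTF-8; 0xC4 followed by "r" is not.
my $copy    = $line;
my $is_utf8 = utf8::decode($copy) ? 1 : 0;

# Splitting the raw bytes is safe: NUL and space never occur inside a
# multi-byte UTF-8 sequence.
my ($refinfo, $committerinfo) = split(/\0/, $line);
my ($hash, $name, $title) = split(' ', $refinfo, 3);

print "well-formed UTF-8: $is_utf8\n";
print "ref: $name, committer: $committerinfo\n";
```

With a ":utf8" layer on the handle the same bytes are treated as UTF-8 before any such check can run, which matches the warnings at lines 2298 and 2303.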
Second, what is the minimal Perl version and Perl configuration (installed
modules) that supports "use open qw(:std :utf8);"? We do have some
minimal requirements for gitweb, and it would be nice if we didn't add
to them. But we already require PerlIO, so it probably doesn't matter.
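One way to check a given installation is sketched below. It assumes that I/O layers require Perl 5.8.0 or later built with PerlIO; that threshold is an assumption worth verifying against the perldelta and open pragma documentation.

```perl
use strict;
use warnings;
use Config;

# Assumption: layers such as ":utf8" need Perl >= 5.8.0 built with PerlIO.
my $new_enough = ($] >= 5.008) ? 1 : 0;
my $has_perlio = (defined $Config{useperlio}
                  && $Config{useperlio} eq 'define') ? 1 : 0;

printf "perl %vd: version ok=%d, PerlIO=%d\n", $^V, $new_enough, $has_perlio;
```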
--
Jakub Narebski
Poland
ShadeHawk on #git