From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: "Jürgen Kreileder" <jk@blackdown.de>,
"John Hawley" <warthog9@kernel.org>, "Jeff King" <peff@peff.net>,
"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [RFD] Handling of non-UTF8 data in gitweb
Date: Fri, 6 Jan 2012 17:35:31 +0100
Message-ID: <201201061735.32908.jnareb@gmail.com>
In-Reply-To: <201112041709.32212.jnareb@gmail.com>
On Sun, 4 Dec 2011, Jakub Narebski wrote:
>
> Currently gitweb converts data it receives from git commands to Perl
> internal utf8 representation via to_utf8() subroutine
[...]
> Each part of data must be handled separately. It is quite error prone
> process, as can be seen from quite a number of patches that fix handling
> of UTF-8 data (latest from Jürgen).
>
>
> Much, much simpler would be to force opening of all files (including
> output pipes from git commands) in ':utf8' mode:
>
> use open qw(:std :utf8);
>
> [Note: perhaps instead of ':utf8' it should be ':encoding(UTF-8)'
> there...]
>
> But doing this would change gitweb behavior. [...]
[...]
> I don't know if people are relying on the old behavior. I guess
> it could be emulated by defining our own 'utf-8-with-fallback'
> encoding, or by defining our own PerlIO layer with PerlIO::via.
> But it no longer be simple solution (though still automatic).
I have now created a simple Encode::UTF8WithFallback module, so that

  use Encode::UTF8WithFallback;
  use open IN => ':encoding(utf8-with-fallback)';

should be able to replace all calls to to_utf8() without any change
in behavior; at least, simple tests show that.
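
For readers unfamiliar with the behavior being preserved, here is a
minimal sketch of the "UTF-8 with fallback" decoding rule: try strict
UTF-8 first, and only if the bytes are not valid UTF-8 decode them with
a fallback encoding.  The function name decode_with_fallback() and the
'latin1' default are assumptions made for illustration; this is not the
code of Encode::UTF8WithFallback, and it does not cover the PerlIO layer
integration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for a configurable fallback encoding.
our $fallback_encoding = 'latin1';

# Decode raw bytes as strict UTF-8; if they are not valid UTF-8,
# decode them with the fallback encoding instead.
sub decode_with_fallback {
    my ($octets) = @_;
    return undef unless defined $octets;

    # decode() with a CHECK argument may modify its input, so work on a copy.
    my $copy = $octets;
    my $text = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $text if defined $text;

    # Not valid UTF-8: reinterpret the original bytes.
    return Encode::decode($fallback_encoding, $octets);
}
```

With this rule, valid UTF-8 input is decoded as UTF-8, while a byte
string such as "J\xfcrgen" (Latin-1) is decoded via the fallback rather
than being replaced with substitution characters.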
There are, however, two problems with this solution:

1. Encode::UTF8WithFallback should really be a separate Perl module
   in a separate file (e.g. 'gitweb/lib/Encode/UTF8WithFallback.pm');
   I was not able to make it work without a separate file.

   This means that it very much requires the changes that allow splitting
   gitweb into many files and/or loading extra helper modules, and/or
   requiring extra non-core modules but providing and installing them with
   gitweb if they are not available.  These changes are ready, and can be
   found in the 'gitweb/split' branch in my git.git repositories:

     http://repo.or.cz/w/git/jnareb-git.git
     https://github.com/jnareb/git

2. It turned out that the "open" pragma 1.04 from Perl v5.8.6 does not
   work correctly.  We need at least "open" 1.06 (version 1.05 supposedly
   consists only of a documentation change).

   Because "open" is a core Perl module (core pragma), this means that
   gitweb will in practice require at least Perl v5.8.9, raising the
   version requirement from the current v5.8.0.
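
As a rough illustration of the second point, the version of the "open"
pragma can be checked at run time.  The helper name
open_pragma_at_least() is hypothetical, and this is only a sketch of one
way to detect a too-old pragma, not something taken from gitweb.

```perl
use strict;
use warnings;

# Return 1 if the installed "open" pragma is at least $min_version, else 0.
# The pragma is loaded by file name because "open" is also a Perl builtin.
sub open_pragma_at_least {
    my ($min_version) = @_;
    my $ok = eval {
        require 'open.pm';
        'open'->VERSION($min_version);   # dies if the pragma is older
        1;
    };
    return $ok ? 1 : 0;
}

warn "the 'open' pragma is older than 1.06\n"
    unless open_pragma_at_least('1.06');
```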
--
Jakub Narebski
Poland